syscalls: use gperf for lookups #204
@pcmoore could you please take a look? |
Seems like this could be a good improvement for applications that are quickly creating/deleting containers with seccomp filters on them. Nice work.
Hi @giuseppe, thanks for your work on this. I think this goes well with some other work I wanted to get done with libseccomp, but as I'm looking for the associated issue I realize that I never created one; the only reference I had outside my head was a discussion with @drakenclimber one day at lunch :/ Let me create that issue today and then we can talk further.
Okay, take a look at #207 and let me know what you think. |
Also, if someone wants to work on this and #207, I think we could consider this for the v2.5 milestone/release. (This would make @drakenclimber and my lives much easier from a maintenance perspective, so anyone willing to work on this gets a large number of gold stars!)
thanks for writing this up. I think it makes a lot of sense to not spread the same information among different files. Would you be ok with having the .perf file as the single source instead of a .csv file? It will be really easy to generalize the x86_64 code to handle multiple archs. We will need to add a column for each arch, in the same way as you've proposed for the .csv file, so it will look more like:
We still need the index column as gperf doesn't keep the same ordering we provide. If you prefer the csv file then we would need to generate the .perf file based on that. The main disadvantage of using gperf directly is that the lookup by syscall number is still |
Well, a CSV file is easy to manipulate with any number of tools, the *.perf file format looks a bit more convoluted. Besides ...
Is using the *.perf file format as the source even desirable given what you've written above? The CSV file format is tool agnostic and gives us plenty of options for other "stuff", including this gperf optimization. As of right now I think the CSV source format is our best option of the two. |
I was not sure as I didn't want to add yet another level of source generation. I went forward and added a .csv file. The perf file is generated from there. I've also adapted the code for arm, as an example of how it is possible to move existing archs to gperf. |
I'm going to preface this by saying I know almost nothing about gperf at the moment, but it would be really nice if we could limit the amount of C code in the "syscalls.perf.template" file to close-to-nothing. It looks like we would need I would also like to see the "syscalls.csv to syscalls.perf" generation commands encapsulated into a separate script instead of open-coded into the makefile. It would also be nice to drop the quotes around the syscall names in the CSV file. Let's also convert the "__PNR_xxx" references into something generic like just "PNR"; the conversion script can change that into the appropriate "__PNR_xxx" value. Further, can we get rid of the explicit offsets in the CSV, e.g. "(__SCMP_NR_BASE + 140)"? It seems like the conversion script could always assume an offset and generate the "(__SCMP_NR_BASE + xxx)" C code and it would be up to the ABI definition to define the offset; for ABIs like x86 that offset could be zero. |
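The CSV conventions requested above (a generic "PNR" marker, and offsets left to the ABI definition) could be expanded by a conversion step along these lines. This is only a sketch in C for illustration: `expand_csv_value` is a hypothetical helper, not part of libseccomp, and the real conversion script may well be written in another language.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: expand one per-ABI value field from the CSV into
 * the C expression the generated table would contain.
 *
 *   "PNR"  -> "__PNR_<syscall>"
 *   "<n>"  -> "(__SCMP_NR_BASE + <n>)"  (the ABI defines the base, 0 on x86)
 *
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int expand_csv_value(const char *syscall, const char *field,
			    char *out, size_t out_len)
{
	int n;

	if (strcmp(field, "PNR") == 0)
		n = snprintf(out, out_len, "__PNR_%s", syscall);
	else
		n = snprintf(out, out_len, "(__SCMP_NR_BASE + %s)", field);

	/* snprintf reports the length it wanted; reject truncated output */
	return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```

With this shape, a row whose field is `PNR` for `socket` becomes `__PNR_socket`, and a field of `140` becomes `(__SCMP_NR_BASE + 140)`, so neither the prefix nor the base has to appear in the CSV itself.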
Side note: you also need to find a way to get your code coverage numbers back up. It might just be a symptom of the PR being a work-in-progress, but we've worked way too hard to have the coverage numbers drop down into the 60% range.
can we do these cleanups for the .csv file incrementally? I've taken care of the other comments, and also the coverage seems to be again ok now. |
I would like to see them all in the same PR. The same goes for supporting all the arch/ABIs. I'm not a big fan of incomplete work where there is just a promise to do it later ;) |
I wasn't even promising that :-) I thought it was fine to just add it for x86_64 and other arches could be adapted later by whoever cares about them. Can we just use the existing values to populate the csv for now? I am not exactly sure how you'd like to rewrite them and it seems like an extra step that could be done separately. Some arches, like s390, have some extra logic in |
I've pushed a new version where I've adapted all the architectures |
Thanks for renaming test 56 :) |
I had a couple small comments and questions, but I think the code looks really good. I'm excited for the performance improvements.
With this large of a change, @pcmoore and I will have to be extra diligent when we validate the syscalls for the next kernel release, but that doesn't seem tremendously onerous.
use gperf to generate a perfect hash to lookup syscall names. It
improves significantly the complexity for seccomp_syscall_resolve_name.*
since it replaces the expensive strcmp for each syscall in the
database, with a lookup table.

The complexity for syscall_resolve_num is not changed and it
uses the linear search, that is anyway less expensive than
seccomp_syscall_resolve_name.* as it uses an index for comparison
instead of doing a string comparison.

On my machine, calling 1000 seccomp_syscall_resolve_name_arch and
seccomp_syscall_resolve_num_arch over the entire syscalls DB passed
from ~0.45 sec to ~0.06s.

Closes: seccomp#207

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
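To illustrate the difference the commit message describes, here is a minimal sketch of a linear strcmp scan next to a perfect-hash lookup that needs a single strcmp. The four-entry table, the hand-picked hash (first byte modulo 7), and all function names are assumptions made for this example; real gperf output chooses its own hash function, and the libseccomp tables are far larger.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature table; the numbers are the x86_64 values. */
struct syscall_entry {
	const char *name;
	int num;
};

static const struct syscall_entry table[] = {
	{ "read", 0 }, { "write", 1 }, { "open", 2 }, { "close", 3 },
};
#define TABLE_LEN (sizeof(table) / sizeof(table[0]))

/* Old approach: one strcmp per entry until a match is found. */
static int resolve_name_linear(const char *name)
{
	for (size_t i = 0; i < TABLE_LEN; i++)
		if (strcmp(table[i].name, name) == 0)
			return table[i].num;
	return -1;
}

/* Hand-picked perfect hash for the four names above (not gperf output). */
#define SLOT_COUNT 7
static const struct syscall_entry *const slots[SLOT_COUNT] = {
	[0] = &table[1], /* 'w' % 7 == 0 */
	[1] = &table[3], /* 'c' % 7 == 1 */
	[2] = &table[0], /* 'r' % 7 == 2 */
	[6] = &table[2], /* 'o' % 7 == 6 */
};

static unsigned int hash_name(const char *name)
{
	return (unsigned char)name[0] % SLOT_COUNT;
}

/* New approach: hash to one slot, then a single strcmp to confirm. */
static int resolve_name_hashed(const char *name)
{
	const struct syscall_entry *e = slots[hash_name(name)];

	if (e != NULL && strcmp(e->name, name) == 0)
		return e->num;
	return -1;
}

/* Number-to-name lookup stays linear but compares integers only. */
static const char *resolve_num_linear(int num)
{
	for (size_t i = 0; i < TABLE_LEN; i++)
		if (table[i].num == num)
			return table[i].name;
	return NULL;
}

/* Sanity check: both name lookups must agree for every table entry. */
static void check_lookups_agree(void)
{
	for (size_t i = 0; i < TABLE_LEN; i++) {
		assert(resolve_name_hashed(table[i].name) == table[i].num);
		assert(resolve_name_linear(table[i].name) == table[i].num);
	}
}
```

The final strcmp in `resolve_name_hashed` is still needed because names outside the table (for example `openat`) hash to an occupied slot; the hash only narrows the search to one candidate.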
@giuseppe @pcmoore @drakenclimber Is this ready to merge? Would love to get this merged and a new release out, so we could take advantage. |
Awesome! Thanks for all the hard work @giuseppe
@pcmoore does the last version look good to you? |
Hi @giuseppe, can you explain the changes to src/arch-syscall-validate? |
Actually never mind; I think I know what you were doing and that is just hiding some underlying problems with these changes and the syscalls table. Let me fix up arch-syscall-validate properly so we can use it to generate the CSV from the kernel sources. We are going to need to do that anyway, and it should avoid having to do what you did to hide problems in the CSV.
@giuseppe @drakenclimber I think this is what we need for arch-syscall-validate: It's really crude but it works (hint: run it with the '-c' option and point it at a recent kernel source tree):
@giuseppe if that looks okay to you, you can include that in your patchset as a second patch or I can add it when I merge your patch; either approach is fine with me. @drakenclimber as you mentioned previously, we are going to really need to verify that we haven't screwed up the syscall tables with this change. Thankfully this should be a one-time event, moving forward we can just generate a new CSV file for each release (and give the diff a quick sanity check of course). |
I think the issue was in the previous implementation for some archs. For example s390 had:
so the value returned by
@pcmoore I am fine with including your patch in the PR, but I think we still need the filters I've added initially to Are you fine with something like:
on top of your patch? With the patch above |
Sorta, there is an issue in s390 and s390x where we aren't handling the syscall lookup correctly, see the discussion in #215 for more information. This of course is a reminder that we really need to be able to specify ABI specific lookup functions, e.g. @giuseppe do you mind if I spend some time hacking on your patch/PR? I'm not sure what my weekend is looking like (COVID-19 has introduced a lot of uncertainty in the world), but I may have some time to work on this if that is okay with you. Of course if you would prefer to fix it yourself, please do! |
See my previous comment, those filters are hiding a problem; we do not want them. As I was reworking that patch (you had the right idea with only dumping the ABI lists once, but we really should use @drakenclimber I'm going to beg forgiveness on this one and merge the |
of course I won't mind that :) |
closing in favor of #223 |
use gperf to generate a perfect hash to lookup syscall names. It
improves significantly the complexity for seccomp_syscall_resolve_name.*
since it replaces the expensive strcmp for each syscall in the
database, with a lookup table.
The complexity for x86_64_syscall_resolve_num is not changed and it
uses the linear search, that is anyway less expensive than
seccomp_syscall_resolve_name.* as it uses an index for comparison
instead of doing a string comparison.
On my machine, calling 1000 seccomp_syscall_resolve_name_arch and
seccomp_syscall_resolve_num_arch over the entire syscalls DB passed
from ~0.45 sec to ~0.06s.
I've implemented it only for x86_64 as I've no access to other archs
for benchmarking the results.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>