-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenZFS 7431 - ZFS Channel Programs #6558
Conversation
There is no native setjmp/longjump available Nor on OsX, so thanks - that'll do nicely :) |
The security implications of this are interesting. Wonder how hard it will be to leverage a permissions issue or vuln to make it run bad things in root context. |
@sempervictus there was a security related discussion in the original OpenZFS PR at (openzfs/openzfs#198) |
@don-brady: thank you sir. That thread indicates a reliance on the illumos permissions structure which IIUC has been altered for Linux. The "run as root in an autonomous context" notion of security is a bit concerning since the means of protection - reliance on root CTX, is the very privesc vector we would look to utilize on engagement. |
FYI -- I re-based with latest master and squashed after addressing all the known build bot issues |
module/lua/setjmp/setjmp_arm64.S
Outdated
@@ -0,0 +1,86 @@ | |||
/*- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be named setjmp_aarch64.S
to match the target arch.
Fails to compile on aarch64 as follows:
zfs/module/lua/setjmp/setjmp_aarch64.S: Assembler messages:
zfs/module/lua/setjmp/setjmp_aarch64.S:57: Error: operand 2 should be an integer register -- `stp x29,lr,[x0],#16'
zfs/module/lua/setjmp/setjmp_aarch64.S:75: Error: operand 2 should be an integer register -- `ldp x29,lr,[x0],#16'
scripts/Makefile.build:344: recipe for target 'zfs/module/lua/setjmp/setjmp_aarch64.o' failed
make[5]: *** [zfs/module/lua/setjmp/setjmp_aarch64.o] Error 1
scripts/Makefile.build:455: recipe for target 'zfs/module/lua' failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. renamed file. The link register is just register 30 so replace lr with x30 fixes the compile
@@ -0,0 +1,65 @@ | |||
/*- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fails to compile as follows:
/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.S: Assembler messages:
/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.S:52: Error: bad instruction `ret'
/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.S:58: Error: bad instruction `ret'
/tmp/ccvXZLXQ.s: Error: .size expression for setjmp does not evaluate to a constant
/tmp/ccvXZLXQ.s: Error: .size expression for longjmp does not evaluate to a constant
scripts/Makefile.build:294: recipe for target '/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.o' failed
make[3]: *** [/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.o] Error 1
scripts/Makefile.build:403: recipe for target '/home/ubuntu/zfs/module/lua' failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep looks like the macro for ret was missing. Fixed
module/lua/Makefile.in
Outdated
|
||
MODULE := zlua | ||
|
||
TARGET_ASM_DIR = @TARGET_ASM_DIR@ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TARGET_ASM_DIR
is defined in always-arch.m4
with the ZFS_AC_CONFIG_ALWAYS_ARCH
macro. However, it is currently only defined for i386|x86_64
it will need to be extends for the other architectures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will try and fix this inside lua section.
@glaubitz @gordan-bobic @wzssyqa @xnox @SIN3R6Y @lundman @jmar83 @harsha544 @loli10K I'd like to ask for your help. You've all contributed to getting ZFS working on non-x86 architectures and we want to make sure we don't regress. This PR depends on new architecture specific assembly and we don't have access to every architecture needed for testing. If you have access to a sparc64, arm, aarch64, ppc, ppc64, ppc64le, mips or s390 system and are willing to help build and test this PR we'd appreciate it! The PR already passes all the test cases for x86 and x86_64 systems and has been merged upstream to OpenZFS so we're primarily concerned with uncovering any architecture specific issues. Thank you in advance! Testing instructions:
Test Results by Architecture
[edit] Table added to track test results as we get them. |
I will run the tests on alpha, hppa, m68k, powerpc, ppc64, sh4, sparc64 and x32. Will report back. |
@glaubitz thank you. For the moment feel free to skip alpha, hppa, m68k, and sh4 none of these have ever been tested to my knowledge so it's likely master won't compile either. |
@behlendorf i'm getting this errors on my RaspberryPi compiling ZFS:
machine info:
I'm going to try cross-compiling later (that's what i usually do for my slower arm boards anyway) and on a vanilla armv7l debian7. |
@don-brady the test cases pass for me on arm and aarch64 with one small fix. The incorrect type used to store the diff --git a/cmd/zfs/zfs_main.c b/cmd/zfs/zfs_main.c
index f0047c9..c353d96 100644
--- a/cmd/zfs/zfs_main.c
+++ b/cmd/zfs/zfs_main.c
@@ -7101,8 +7101,7 @@ usage:
static int
zfs_do_channel_program(int argc, char **argv)
{
- int ret, fd;
- char c;
+ int ret, fd, c;
char *progbuf, *filename, *poolname;
size_t progsize, progread;
nvlist_t *outnvl;
@@ -7111,8 +7110,7 @@ zfs_do_channel_program(int argc, char **argv)
zpool_handle_t *zhp;
/* check options */
- while (-1 !=
- (c = getopt(argc, argv, "t:m:"))) {
+ while ((c = getopt(argc, argv, "t:m:")) != -1) {
switch (c) {
case 't':
case 'm': { Machines:
@loli10K I didn't run in to any build issue on my RaspberryPi. It's running Ubuntu and not Raspbian though so you may have hit a distribution specific issue. |
@behlendorf AFAIK "/usr/include/ctype.h" from https://packages.ubuntu.com/xenial/armhf/libc6-dev/filelist defines both One thing to note that may be relevant: i build with custom EDIT: i think it is relevant, i can reproduce the same issue on amd64 debian8 with
|
Sorry for not getting back earlier. I have now finally found the time to perform the testing. I have access to all the architecture types shown above and I would like to start with sparc64. Once I know how to perform the tests, I will follow up with the remaining architectures. One question: What is the best way to checkout this particular PR? I have now just checked out spl and zfs from master on my sparc64 machine and built both. Shall I just pull the latest version from |
Horum, I see that s390x implementation for setjmp.S is missing =( I don't think I can quite write that from scratch / clean-room. Should I look into cherrypicking *BSD implementation of it? |
@glaubitz thank you in advance! The test procedure is described in this #6558 (comment) above, I've updated it to include additional instructions for checking out @don-brady's zfs-channel-programs branch. Just take note of the commit hash you're testing with and include that with your results. @xnon we can use the *BSD version for s390, several of the other implementations were already brought over from FreeBSD and Illumos as mentioned in the top comment. They both fall under a compatible license. |
@behlendorf Sorry to step in late here. I tested the above mentioned test-case on ppc64le arch with Ubuntu-16.04.2 as OS . Here's the O/P for the test run Running Time: 00:01:09 The 2 failing testcases are below :- Per logs the reason for failure for both of the testcases is : |
merged to latest master to enable code coverage testing |
@harsha544 thank you. It sounds like everything basically worked but there's a a small flaw in the test cases around cleanup up the test pools. |
module/zfs/zfs_ioctl.c
Outdated
@@ -6317,6 +6359,11 @@ zfs_ioctl_init(void) | |||
zfs_ioc_pool_sync, zfs_secpolicy_none, POOL_NAME, | |||
POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY, B_FALSE, B_FALSE); | |||
|
|||
zfs_ioctl_register("channel_program", ZFS_IOC_CHANNEL_PROGRAM, | |||
zfs_ioc_channel_program, zfs_secpolicy_config, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be paranoid but perhaps it makes sense to also make sure we have the CAP_SYS_MODULE capacity, instead of just CAP_SYS_ADMIN which is what the zfs_secpolicy_config() checks for.
Might be overkill but reasoning about the security of this becomes easier: if we have CAP_SYS_MODULE we can change the kernel in arbitrary ways anyway so a restricted Lua script is not a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Paranoid is good, but I'm not sure CAP_SYS_MODULE
is appropriate. Configuring the network stack in Linux also depends on an in-kernel language and they previously used CAP_SYS_MODULE
for the reasons you mentioned before switching to CAP_NET_ADMIN
. Today CAP_SYS_MODULE
only covers loading/unloading kmods. Adding CAP_NET_ADMIN
sounds like a good idea, even if the name is misleading.
This really smells like a bad idea. I get that Illumos and FreeBSD have it, but I'm concerned of the implications of allowing Lua script/programs to run amok in the Linux kernel. I'd like to see a build-time flag to completely disable this component from being compiled in and supported (I didn't see one, but maybe I missed something...). |
@Conan-Kudo I share your concerns, I had the same initial misgivings when this was first proposed. But in reality the kernel already depends on multiple internal virtual machines for several critical subsystems. For example, the network subsystem depends on BPF to flexibly configure interfaces and do efficient packet filtering. Heck, this isn't even the first Lua interpreter which has been added to the kernel since it used by ktap. Now we definitely do need to get this right. Which is why we're using the canonical Lua.org implementation and have added test coverage of the interpreter itself. But this really isn't new territory for a kernel anymore and it's been proven to be a good solution. As for adding a configure option to disable it, that's something worth considering. However, it would result in certain features being total non-functional. |
I think that's sort of the point, and I would be fine with this. But if this is unacceptable, some kind of runtime guard (option flag required to enable it when loading the kmod or something) to ensure it's not something that can cause more fundamental problems would be fine as well. People who would want to use it need to explicitly activate the feature (either at build or module load time), or it is dead code. |
Ok, I finally managed to start testing. It actually fails to build on sparc64, even on master:
|
On ppc64 (POWER5, Big-Endian), configuring of spl fails with:
This problem does not show on PowerPC (POWER4, Big-Endian, 32-Bit). |
It fails on PowerPC as well, the same way as sparc64:
Now, when looking at it, this is because we're building with "-Werror" by default. Should I turn "-Werror" off? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@don-brady thanks for refreshing this and all the work to get it ship shape. This PR looks ready to merge.
The sparc64 implementation has been updated to use the freebsd version of longjmp and all other supported architectures have been manually tested at least once. @glaubitz if there are still issues observed on sparc64 I'll take you up on your offer and get them sorted out. Thank you everyone for the help!
@behlendorf I can re-test a last time before you merge the PR if you want. |
@glaubitz that'd be great. |
@behlendorf Ok, working on it now. |
Building with CC vdev_raidz_math.lo
../../module/zfs/vdev_raidz_math.c: In function ‘vdev_raidz_math_get_ops’:
../../module/zfs/vdev_raidz_math.c:135:24: error: array subscript is above array bounds [-Werror=array-bounds]
ops = raidz_supp_impl[impl];
~~~~~~~~~~~~~~~^~~~~~
cc1: all warnings being treated as errors
Makefile:999: recipe for target 'vdev_raidz_math.lo' failed with gcc-7. |
The testsuite won't run. Installed all the utils available in Debian though:
|
@glaubitz sorry about that. Bad news, due to commit 9a810ef which this was rebased on the original Good news, because of that commit a custom run file is no longer needed and a tag can be used to specify the tests which should be run. Can you try running the tests again like this:
The |
Doesn't look too good, unfortunately:
I suggest someone looking at the code should tackle the unaligned access first. |
@don-brady when you get a chance could you please rebase this on master. |
Codecov Report
@@ Coverage Diff @@
## master #6558 +/- ##
==========================================
+ Coverage 75.5% 76.17% +0.67%
==========================================
Files 296 324 +28
Lines 95908 102928 +7020
==========================================
+ Hits 72411 78410 +5999
- Misses 23497 24518 +1021
Continue to review full report at Codecov.
|
Authored by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Approved by: Garrett D'Amore <garrett@damore.org> Ported-by: Don Brady <don.brady@delphix.com> Ported-by: John Kennedy <john.kennedy@delphix.com> OpenZFS-issue: https://www.illumos.org/issues/7431 OpenZFS-commit: openzfs/openzfs@dfc11533 Porting Notes: * The CLI long option arguments for '-t' and '-m' don't parse on linux * Switched from kmem_alloc to vmem_alloc in zcp_lua_alloc * Lua implementation is built as its own module (zlua.ko) * Lua headers consumed directly by zfs code moved to 'include/sys/lua/' * There is no native setjmp/longjump available in stock Linux kernel. Brought over implementations from illumos and FreeBSD * The get_temporary_prop() was adapted due to VFS platform differences * Use of inline functions in lua parser to reduce stack usage per C call * Skip some ZFS Test Suite ZCP tests on sparc64 to avoid stack overflow
Authored by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Ported-by: Don Brady <don.brady@delphix.com> zfs.exists() in channel programs doesn't return any result, and should have a man page entry. This patch corrects zfs.exists so that it returns a value indicating if the dataset exists or not. It also adds documentation about it in the man page. OpenZFS-issue: https://www.illumos.org/issues/8605 OpenZFS-commit: openzfs/openzfs@1e85e111
Authored by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Ported-by: Don Brady <don.brady@delphix.com> ZFS channel programs should be able to perform a rollback. OpenZFS-issue: https://www.illumos.org/issues/8592 OpenZFS-commit: openzfs/openzfs@d46b5ed6
Authored by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: Brad Lewis <brad.lewis@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Ported-by: Don Brady <don.brady@delphix.com> ZFS channel programs should be able to create snapshots. In addition to the base snapshot functionality, this entails extra logic to handle edge cases which were formerly not possible, such as creating then destroying a snapshot in the same transaction sync. OpenZFS-issue: https://www.illumos.org/issues/8600 OpenZFS-commit: openzfs/openzfs@68089b8b
Signed-off-by: Don Brady <don.brady@delphix.com>
Add test coverage for lua libraries Remove dead code in Lua implementation Signed-off-by: Don Brady <don.brady@delphix.com>
Authored by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Andy Stormont <astormont@racktopsystems.com> Approved by: Robert Mustacchi <rm@joyent.com> Ported-by: Don Brady <don.brady@delphix.com> Every time we want to unmount a snapshot (happens during snapshot deletion or renaming) we unnecessarily iterate through all the mountpoints in the VFS layer (see zfs_get_vfs). The current patch completely gets rid of that code and changes the approach while keeping the behavior of that code path the same. Specifically, it puts a hold on the dataset/snapshot and gets its vfs resource reference directly, instead of linearly searching for it. If that reference exists we attempt to amount it. With the above change, it became obvious that the nvlist manipulations that we do (add_boolean and add_nvlist) take a significant amount of time ensuring uniqueness of every new element even though they don't have too. Thus, we updated the patch so those nvlists are not trying to enforce the uniqueness of their elements. A more complete analysis of the problem solved by this patch can be found below: https://sdimitro.github.io/post/snap-unmount-perf/ OpenZFS-issue: https://www.illumos.org/issues/8604 OpenZFS-commit: openzfs/openzfs@126118fb
Authored by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Ported-by: Don Brady <don.brady@delphix.com> We want to be able to run channel programs outside of synching context. This would greatly improve performance for channel programs that just gather information, as they won't have to wait for synching context anymore. === What is implemented? This feature introduces the following: - A new command line flag in "zfs program" to specify our intention to run in open context. (The -n option) - A new flag/option within the channel program ioctl which selects the context. - Appropriate error handling whenever we try a channel program in open-context that contains zfs.sync* expressions. - Documentation for the new feature in the manual pages. === How do we handle zfs.sync functions in open context? When such a function is found by the interpreter and we are running in open context we abort the script and we spit out a descriptive runtime error. For example, given the script below ... arg = ... fs = arg["argv"][1] err = zfs.sync.destroy(fs) msg = "destroying " .. fs .. " err=" .. err return msg if we run it in open context, we will get back the following error: Channel program execution failed: [string "channel program"]:3: running functions from the zfs.sync submodule requires passing sync=TRUE to lzc_channel_program() (i.e. do not specify the "-n" command line argument) stack traceback: [C]: in function 'destroy' [string "channel program"]:3: in main chunk === What about testing? We've introduced new wrappers for all channel program tests that run each channel program as both (startard & open-context) and expect the appropriate behavior depending on the program using the zfs.sync module. OpenZFS-issue: https://www.illumos.org/issues/8677 OpenZFS-commit: openzfs/openzfs@17a49e15
Authored by: Serapheim Dimitropoulos <serapheim@delphix.com> Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed by: Chris Williamson <chris.williamson@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Ported-by: Don Brady <don.brady@delphix.com> We want to be able to run channel programs outside of synching context. This would greatly improve performance for channel programs that just gather information, as they won't have to wait for synching context anymore. === What is implemented? This feature introduces the following: - A new command line flag in "zfs program" to specify our intention to run in open context. (The -n option) - A new flag/option within the channel program ioctl which selects the context. - Appropriate error handling whenever we try a channel program in open-context that contains zfs.sync* expressions. - Documentation for the new feature in the manual pages. === How do we handle zfs.sync functions in open context? When such a function is found by the interpreter and we are running in open context we abort the script and we spit out a descriptive runtime error. For example, given the script below ... arg = ... fs = arg["argv"][1] err = zfs.sync.destroy(fs) msg = "destroying " .. fs .. " err=" .. err return msg if we run it in open context, we will get back the following error: Channel program execution failed: [string "channel program"]:3: running functions from the zfs.sync submodule requires passing sync=TRUE to lzc_channel_program() (i.e. do not specify the "-n" command line argument) stack traceback: [C]: in function 'destroy' [string "channel program"]:3: in main chunk === What about testing? We've introduced new wrappers for all channel program tests that run each channel program as both (startard & open-context) and expect the appropriate behavior depending on the program using the zfs.sync module. OpenZFS-issue: https://www.illumos.org/issues/8677 OpenZFS-commit: openzfs/openzfs@17a49e15 Closes openzfs#6558
On systems with CONFIG_SMP turned off, spin_is_locked always returns false causing these assertions to fail. Remove them as suggested in openzfs/zfs#6558. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: James Cowgill <james.cowgill@mips.com> Closes openzfs#665
The changes piggyback JSON output support on top of channel programs (#6558). This way the JSON output support is targeted to scripting use cases and is easily maintainable since it really only touches one function (zfs_do_channel_program()). This patch ports Joyent's JSON nvlist library from illumos to enable easy JSON printing of channel program output nvlist. To keep the delta small I also took advantage of the fact that printing in zfs_do_channel_program() was almost always done before exiting the program. Reviewed by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alek Pinchuk <apinchuk@datto.com> Closes #7281
On systems with CONFIG_SMP turned off, spin_is_locked always returns false causing these assertions to fail. Remove them as suggested in openzfs/zfs#6558. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: James Cowgill <james.cowgill@mips.com> Closes #665
Authored by: Chris Williamson chris.williamson@delphix.com
Reviewed by: Matthew Ahrens mahrens@delphix.com
Reviewed by: George Wilson george.wilson@delphix.com
Reviewed by: John Kennedy john.kennedy@delphix.com
Reviewed by: Dan Kimmel dan.kimmel@delphix.com
Approved by: Garrett D'Amore garrett@damore.org
Ported-by: Don Brady don.brady@delphix.com
Ported-by: John Kennedy john.kennedy@delphix.com
OpenZFS-issue: https://www.illumos.org/issues/7431
OpenZFS-commit: openzfs/openzfs@dfc11533
Porting Notes:
How Has This Been Tested?
Manual tests
Automated tests
Types of changes
Checklist:
Signed-off-by
.