Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenZFS 7431 - ZFS Channel Programs #6558

Closed
wants to merge 8 commits into from

Conversation

Projects
None yet
@don-brady
Copy link
Member

commented Aug 25, 2017

Authored by: Chris Williamson chris.williamson@delphix.com
Reviewed by: Matthew Ahrens mahrens@delphix.com
Reviewed by: George Wilson george.wilson@delphix.com
Reviewed by: John Kennedy john.kennedy@delphix.com
Reviewed by: Dan Kimmel dan.kimmel@delphix.com
Approved by: Garrett D'Amore garrett@damore.org
Ported-by: Don Brady don.brady@delphix.com
Ported-by: John Kennedy john.kennedy@delphix.com

OpenZFS-issue: https://www.illumos.org/issues/7431
OpenZFS-commit: openzfs/openzfs@dfc1153

Porting Notes:

  • The CLI long option arguments for '-t' and '-m' don't parse on linux
  • Switched from kmem_alloc to vmem_alloc in zcp_lua_alloc
  • Lua implementation is built as its own module (zlua.ko)
  • Lua headers consumed directly by zfs code moved to 'include/sys/lua/'
  • There is no native setjmp/longjump available in stock Linux kernel. Brought over implementation from illumos and FreeBSD
  • The get_temporary_prop() was adapted due to VFS platform differences
  • Use of in-lining functions in lua parser code to reduce stack usage per nested C call

How Has This Been Tested?

Manual tests

  • running basic get-props channel programs from CLI
  • exercised the zfs property get CLI with the envr ZFS_PROP_DEBUG=1 set

Automated tests

  • ztest runs that exercise the new ZCP destroy snapshots path
  • new ZTS channel_program functional tests

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

@behlendorf behlendorf added this to the 0.8.0 milestone Aug 25, 2017

@lundman

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2017

There is no native setjmp/longjump available

Nor on OsX, so thanks - that'll do nicely :)

@dinatale2 dinatale2 self-requested a review Aug 25, 2017

@sempervictus

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2017

The security implications of this are interesting. Wonder how hard it will be to leverage a permissions issue or vuln to make it run bad things in root context.
Will read over it shortly, but can anyone summarize the threat/defense model used in review?

@don-brady

This comment has been minimized.

Copy link
Member Author

commented Aug 25, 2017

@sempervictus there was a security related discussion in the original OpenZFS PR at (openzfs/openzfs#198)

@sempervictus

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2017

@don-brady: thank you sir. That thread indicates a reliance on the illumos permissions structure which IIUC has been altered for Linux. The "run as root in an autonomous context" notion of security is a bit concerning since the means of protection - reliance on root CTX, is the very privesc vector we would look to utilize on engagement.
I think it would make sense to brainstorm the potential set of surfaces this exposes (MITRE's AT&K is a good start), and develop validations and mitigations to be applied at build time to ensure future commits don't open a root hole in our systems. Few things slow implementation in real world environments than some lwn/msnbc article about how evil cddl binaries subvert security in the "oh so secure" GPL side of the house. :-)

@don-brady don-brady force-pushed the don-brady:zfs-channel-programs branch from 6d4f92c to db2c908 Aug 30, 2017

@don-brady

This comment has been minimized.

Copy link
Member Author

commented Aug 30, 2017

FYI -- I re-based with latest master and squashed after addressing all the known build bot issues

@don-brady don-brady force-pushed the don-brady:zfs-channel-programs branch from db2c908 to 9730797 Aug 31, 2017

@@ -0,0 +1,86 @@
/*-

This comment has been minimized.

Copy link
@behlendorf

behlendorf Sep 2, 2017

Member

Should be named setjmp_aarch64.S to match the target arch.

Fails to compile on aarch64 as follows:

zfs/module/lua/setjmp/setjmp_aarch64.S: Assembler messages:
zfs/module/lua/setjmp/setjmp_aarch64.S:57: Error: operand 2 should be an integer register -- `stp x29,lr,[x0],#16'
zfs/module/lua/setjmp/setjmp_aarch64.S:75: Error: operand 2 should be an integer register -- `ldp x29,lr,[x0],#16'
scripts/Makefile.build:344: recipe for target 'zfs/module/lua/setjmp/setjmp_aarch64.o' failed
make[5]: *** [zfs/module/lua/setjmp/setjmp_aarch64.o] Error 1
scripts/Makefile.build:455: recipe for target 'zfs/module/lua' failed

This comment has been minimized.

Copy link
@don-brady

don-brady Sep 5, 2017

Author Member

Yep. renamed file. The link register is just register 30 so replace lr with x30 fixes the compile

@@ -0,0 +1,65 @@
/*-

This comment has been minimized.

Copy link
@behlendorf

behlendorf Sep 2, 2017

Member

Fails to compile as follows:

/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.S: Assembler messages:
/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.S:52: Error: bad instruction `ret'
/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.S:58: Error: bad instruction `ret'
/tmp/ccvXZLXQ.s: Error: .size expression for setjmp does not evaluate to a constant
/tmp/ccvXZLXQ.s: Error: .size expression for longjmp does not evaluate to a constant
scripts/Makefile.build:294: recipe for target '/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.o' failed
make[3]: *** [/home/ubuntu/zfs/module/lua/setjmp/setjmp_arm.o] Error 1
scripts/Makefile.build:403: recipe for target '/home/ubuntu/zfs/module/lua' failed

This comment has been minimized.

Copy link
@don-brady

don-brady Sep 5, 2017

Author Member

yep looks like the macro for ret was missing. Fixed


MODULE := zlua

TARGET_ASM_DIR = @TARGET_ASM_DIR@

This comment has been minimized.

Copy link
@behlendorf

behlendorf Sep 2, 2017

Member

TARGET_ASM_DIR is defined in always-arch.m4 with the ZFS_AC_CONFIG_ALWAYS_ARCH macro. However, it is currently only defined for i386|x86_64 it will need to be extends for the other architectures.

This comment has been minimized.

Copy link
@don-brady

don-brady Sep 5, 2017

Author Member

Will try and fix this inside lua section.

@behlendorf

This comment has been minimized.

Copy link
Member

commented Sep 7, 2017

@glaubitz @gordan-bobic @wzssyqa @xnox @SIN3R6Y @lundman @jmar83 @harsha544 @loli10K I'd like to ask for your help. You've all contributed to getting ZFS working on non-x86 architectures and we want to make sure we don't regress.

This PR depends on new architecture specific assembly and we don't have access to every architecture needed for testing. If you have access to a sparc64, arm, aarch64, ppc, ppc64, ppc64le, mips or s390 system and are willing to help build and test this PR we'd appreciate it! The PR already passes all the test cases for x86 and x86_64 systems and has been merged upstream to OpenZFS so we're primarily concerned with uncovering any architecture specific issues.

Thank you in advance!

Testing instructions:

  1. You can do all the needed testing in-tree so this shouldn't be too disruptive. You'll first need to clone the zfs repository and checkout this PR, there are directions for developing in-tree posted on the wiki, but I'll summarize them here.
$ git clone https://github.com/zfsonlinux/spl.git
$ cd spl
$ sh autogen
$ ./configure --enable-debug
$ make -s -j$(nproc)
$ cd ..

$ git clone https://github.com/zfsonlinux/zfs.git
$ cd zfs
$ git fetch origin pull/6558/head:zfs-channel-programs
$ git checkout zfs-channel-programs
$ sh autogen
$ ./configure --enable-debug
$ make -s -j$(nproc)
  1. Assuming you don't run in to any issues compiling you can use the scripts/zfs.sh script to load the new modules on the system. You'll also need to run sudo ./scripts/zfs-helpers.sh -iv to create symlinks on the system to a few helpers programs in the zfs source tree. You can run ./scripts/zfs-helpers.sh -rv to remove the links after testing.
$ sudo ./scripts/zfs.sh
$ sudo ./scripts/zfs-helpers.sh -iv
  1. We're primarily interested in running the ZFS Channel Program test cases so my recommendation is to create a custom runfile for the ZFS Test Suite called tests/runfiles/zcp.run which only includes the new tests and looks like this. If you'd like to run the entire test suite, which could take many hours, you can using the existing linux.run file.
$ cat <<EOF >> tests/runfiles/zcp.run
[DEFAULT]
pre = setup
quiet = False
pre_user = root
user = root
timeout = 600
post_user = root
post = cleanup
outputdir = /var/tmp/test_results

[tests/functional/channel_program/lua_core]
tests = ['tst.args_to_lua', 'tst.divide_by_zero', 'tst.integer_illegal',
    'tst.integer_overflow', 'tst.language_functions_neg',
    'tst.language_functions_pos', 'tst.large_prog', 'tst.memory_limit',
    'tst.nested_neg', 'tst.nested_pos', 'tst.nvlist_to_lua',
    'tst.recursive_neg', 'tst.recursive_pos', 'tst.return_nvlist_neg',
    'tst.return_nvlist_pos', 'tst.return_recursive_table', 'tst.timeout']

[tests/functional/channel_program/synctask_core]
tests = ['tst.destroy_fs', 'tst.destroy_snap', 'tst.get_count_and_limit',
    'tst.get_index_props', 'tst.get_mountpoint', 'tst.get_neg',
    'tst.get_number_props', 'tst.get_string_props', 'tst.get_type',
    'tst.get_userquota', 'tst.get_written', 'tst.list_children',
    'tst.list_clones', 'tst.list_snapshots', 'tst.list_system_props',
    'tst.parse_args_neg','tst.promote_conflict', 'tst.promote_multiple',
    'tst.promote_simple']
EOF
  1. Finally you can run the ZFS Test Suite with the following command from the top-level zfs directory, scripts/zfs-tests.sh -v -r zcp. Assuming all the tests pass you'll see something like this. If they fail please provide any debugging you can to help us debug it. Also make sure you include the version tested, it can be read from the /sys/module/zfs/version file. Thanks you again for any help you can offer.
$ cat /sys/module/zfs/version
$ ./scripts/zfs-tests.sh -v -r zcp
...

Results Summary
SKIP	  0
PASS	 40

Running Time:	00:05:05
Percent passed:	100.0%
Log directory:	/var/tmp/test_results/20170907T013916
  1. Optional cleanup
$ sudo ./scripts/zfs-helpers -rv
$ sudo ./scripts/zfs.sh -u
$ cd ..
$ rm -Rf spl zfs

Test Results by Architecture

Arch Result Tested-by Distribution Commit
x86 PASS buildbot zfs-0.7.0-61-gd6484a9
x86_64 PASS buildbot zfs-0.7.0-61-gd6484a9
sparc64 PASS @don-brady Debian-9
armv7 PASS @behlendorf Ubuntu 16.04.3 LTS zfs-0.7.0-61-gd6484a9
aarch64 PASS @behlendorf Ubuntu 16.04.3 LTS zfs-0.7.0-61-gd6484a9
ppc PASS @glaubitz Debian zfs-0.7.0-108-gc8da8a3
ppc64 PASS @glaubitz OpenSUSE Tumbleweed (20170924) 0.7.0-135_g3f51f0b78
ppc64le PASS (mostly) @harsha544 Ubuntu-16.04.2 LTS zfs-0.7.0-61-gd6484a9
mips64 PASS (with) @jcowgill zfs-0.7.0-108-gc8da8a3
s390

[edit] Table added to track test results as we get them.

@glaubitz

This comment has been minimized.

Copy link
Contributor

commented Sep 7, 2017

I will run the tests on alpha, hppa, m68k, powerpc, ppc64, sh4, sparc64 and x32. Will report back.

@behlendorf

This comment has been minimized.

Copy link
Member

commented Sep 8, 2017

@glaubitz thank you. For the moment feel free to skip alpha, hppa, m68k, and sh4 none of these have ever been tested to my knowledge so it's likely master won't compile either.

@loli10K

This comment has been minimized.

Copy link
Contributor

commented Sep 8, 2017

@behlendorf i'm getting this errors on my RaspberryPi compiling ZFS:

  CC       lstate.lo
  CC       lstring.lo
  CC       lstrlib.lo
../../module/lua/lstrlib.c:35:0: error: "iscntrl" redefined [-Werror]
 #define iscntrl(C) ((((C) >= 0) && ((C) <= 0x1f)) || ((C) == 0x7f))
 ^
In file included from ../../include/sys/zfs_context.h:105:0,
                 from ../../include/sys/lua/lua.h:12,
                 from ../../module/lua/lstrlib.c:11:
/usr/include/ctype.h:199:0: note: this is the location of the previous definition
 # define iscntrl(c) __isctype((c), _IScntrl)
 ^
../../module/lua/lstrlib.c:36:0: error: "isgraph" redefined [-Werror]
 #define isgraph(C) ((C) >= 0x21 && (C) <= 0x7E)
 ^
In file included from ../../include/sys/zfs_context.h:105:0,
                 from ../../include/sys/lua/lua.h:12,
                 from ../../module/lua/lstrlib.c:11:
/usr/include/ctype.h:202:0: note: this is the location of the previous definition
 # define isgraph(c) __isctype((c), _ISgraph)
 ^
../../module/lua/lstrlib.c:37:0: error: "ispunct" redefined [-Werror]
 #define ispunct(C) (((C) >= 0x21 && (C) <= 0x2F) || \
 ^
In file included from ../../include/sys/zfs_context.h:105:0,
                 from ../../include/sys/lua/lua.h:12,
                 from ../../module/lua/lstrlib.c:11:
/usr/include/ctype.h:204:0: note: this is the location of the previous definition
 # define ispunct(c) __isctype((c), _ISpunct)
 ^
cc1: all warnings being treated as errors
Makefile:980: recipe for target 'lstrlib.lo' failed
make[3]: *** [lstrlib.lo] Error 1
make[3]: Leaving directory '/usr/src/zfs/lib/libzpool'
Makefile:538: recipe for target 'all-recursive' failed
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory '/usr/src/zfs/lib'
Makefile:707: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/usr/src/zfs'
Makefile:577: recipe for target 'all' failed
make: *** [all] Error 2

real    81m7.259s
user    65m18.480s
sys     8m13.880s
root@raspberrypi:/usr/src/zfs# 

machine info:

root@raspberrypi:~# uname -m                                                                                                                                                                                       
armv6l
root@raspberrypi:~# lsb_release -a                                                                                                                                                                                 
No LSB modules are available.
Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 8.0 (jessie)
Release:        8.0
Codename:       jessie
root@raspberrypi:~# dpkg-query -W -f='${binary:Package}:${Version}\n' libc6-dev
libc6-dev:armhf:2.19-18+deb8u10
root@raspberrypi:~# 

I'm going to try cross-compiling later (that's what i usually do for my slower arm boards anyway) and on a vanilla armv7l debian7.

@behlendorf

This comment has been minimized.

Copy link
Member

commented Sep 8, 2017

@don-brady the test cases pass for me on arm and aarch64 with one small fix. The incorrect type used to store the getopt() return value resulted in it spinning in the while loop. The second hunk was just to make this code consistent with everything else and readable.

diff --git a/cmd/zfs/zfs_main.c b/cmd/zfs/zfs_main.c
index f0047c9..c353d96 100644
--- a/cmd/zfs/zfs_main.c
+++ b/cmd/zfs/zfs_main.c
@@ -7101,8 +7101,7 @@ usage:
 static int
 zfs_do_channel_program(int argc, char **argv)
 {
-       int ret, fd;
-       char c;
+       int ret, fd, c;
        char *progbuf, *filename, *poolname;
        size_t progsize, progread;
        nvlist_t *outnvl;
@@ -7111,8 +7110,7 @@ zfs_do_channel_program(int argc, char **argv)
        zpool_handle_t *zhp;
 
        /* check options */
-       while (-1 !=
-           (c = getopt(argc, argv, "t:m:"))) {
+       while ((c = getopt(argc, argv, "t:m:")) != -1) {
                switch (c) {
                case 't':
                case 'm': {

Machines:

odroid@odroid64:~/src/git/zfs$ uname -m
aarch64
odroid@odroid64:~/src/git/zfs$ lsb_release -a        
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.3 LTS
Release:	16.04
Codename:	xenial
ubuntu@ubuntu:~/zfs$ uname -m
armv7l
ubuntu@ubuntu:~/zfs$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.3 LTS
Release:	16.04
Codename:	xenial

@loli10K I didn't run in to any build issue on my RaspberryPi. It's running Ubuntu and not Raspbian though so you may have hit a distribution specific issue.

@loli10K

This comment has been minimized.

Copy link
Contributor

commented Sep 8, 2017

@behlendorf AFAIK "/usr/include/ctype.h" from https://packages.ubuntu.com/xenial/armhf/libc6-dev/filelist defines both iscntrl and tolower (the latter is used to #ifndef these definitions).

One thing to note that may be relevant: i build with custom CFLAGS='-O0 -g -Wno-error=frame-larger-than=', maybe something is getting optimized out without these, avoiding these redefinition errors?

EDIT: i think it is relevant, i can reproduce the same issue on amd64 debian8 with CFLAGS='-O0 -Wno-error=frame-larger-than=':

  CC       lstate.lo
  CC       lstring.lo
  CC       lstrlib.lo
  CC       ltable.lo
../../module/lua/lstrlib.c:35:0: error: "iscntrl" redefined [-Werror]
 #define iscntrl(C) ((((C) >= 0) && ((C) <= 0x1f)) || ((C) == 0x7f))
 ^
In file included from ../../include/sys/zfs_context.h:105:0,
                 from ../../include/sys/lua/lua.h:12,
                 from ../../module/lua/lstrlib.c:11:
/usr/include/ctype.h:199:0: note: this is the location of the previous definition
 # define iscntrl(c) __isctype((c), _IScntrl)
 ^
../../module/lua/lstrlib.c:36:0: error: "isgraph" redefined [-Werror]
 #define isgraph(C) ((C) >= 0x21 && (C) <= 0x7E)
 ^
In file included from ../../include/sys/zfs_context.h:105:0,
                 from ../../include/sys/lua/lua.h:12,
                 from ../../module/lua/lstrlib.c:11:
/usr/include/ctype.h:202:0: note: this is the location of the previous definition
 # define isgraph(c) __isctype((c), _ISgraph)
 ^
../../module/lua/lstrlib.c:37:0: error: "ispunct" redefined [-Werror]
 #define ispunct(C) (((C) >= 0x21 && (C) <= 0x2F) || \
 ^
In file included from ../../include/sys/zfs_context.h:105:0,
                 from ../../include/sys/lua/lua.h:12,
                 from ../../module/lua/lstrlib.c:11:
/usr/include/ctype.h:204:0: note: this is the location of the previous definition
 # define ispunct(c) __isctype((c), _ISpunct)
 ^
cc1: all warnings being treated as errors
Makefile:980: recipe for target 'lstrlib.lo' failed
make[3]: *** [lstrlib.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory '/usr/src/zfs/lib/libzpool'
Makefile:538: recipe for target 'all-recursive' failed
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory '/usr/src/zfs/lib'
Makefile:707: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/usr/src/zfs'
Makefile:577: recipe for target 'all' failed
make: *** [all] Error 2
root@linux:/usr/src/zfs# uname -m
x86_64
root@linux:/usr/src/zfs# lsb_release -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 8.0 (jessie)
Release:	8.0
Codename:	jessie
root@linux:/usr/src/zfs# 
@glaubitz

This comment has been minimized.

Copy link
Contributor

commented Sep 30, 2017

Sorry for not getting back earlier. I have now finally found the time to perform the testing. I have access to all the architecture types shown above and I would like to start with sparc64. Once I know how to perform the tests, I will follow up with the remaining architectures.

One question: What is the best way to checkout this particular PR? I have now just checked out spl and zfs from master on my sparc64 machine and built both. Shall I just pull the latest version from don-brady:zfs-channel-programs?

@xnox

This comment has been minimized.

Copy link
Contributor

commented Oct 2, 2017

Horum, I see that s390x implementation for setjmp.S is missing =( I don't think I can quite write that from scratch / clean-room. Should I look into cherrypicking *BSD implementation of it?

@behlendorf

This comment has been minimized.

Copy link
Member

commented Oct 2, 2017

@glaubitz thank you in advance! The test procedure is described in this #6558 (comment) above, I've updated it to include additional instructions for checking out @don-brady's zfs-channel-programs branch. Just take note of the commit hash you're testing with and include that with your results.

@xnon we can use the *BSD version for s390, several of the other implementations were already brought over from FreeBSD and Illumos as mentioned in the top comment. They both fall under a compatible license.

@harsha544

This comment has been minimized.

Copy link

commented Oct 3, 2017

@behlendorf Sorry to step in late here. I tested the above mentioned test-case on ppc64le arch with Ubuntu-16.04.2 as OS . Here's the O/P for the test run
Results Summary
FAIL 2
PASS 38

Running Time: 00:01:09
Percent passed: 95.0%
Log directory: /var/tmp/test_results/20171003T061518

The 2 failing testcases are below :-
Test: /home/harsha/zfs/tests/zfs-tests/tests/functional/channel_program/lua_core/cleanup (run as root) [00:00] [FAIL]
Test: /home/harsha/zfs/tests/zfs-tests/tests/functional/channel_program/synctask_core/cleanup (run as root) [00:00] [FAIL]

Per logs the reason for failure for both of the testcases is :
06:15:33.74 rm: cannot remove '/var/tmp/testdir': Device or resource busy
06:15:33.74 ERROR: rm -rf /var/tmp/testdir exited 1

@don-brady don-brady force-pushed the don-brady:zfs-channel-programs branch from 2748c87 to 8441a70 Oct 3, 2017

@don-brady

This comment has been minimized.

Copy link
Member Author

commented Oct 3, 2017

merged to latest master to enable code coverage testing

@behlendorf

This comment has been minimized.

Copy link
Member

commented Oct 3, 2017

@harsha544 thank you. It sounds like everything basically worked but there's a a small flaw in the test cases around cleanup up the test pools.

@@ -6317,6 +6359,11 @@ zfs_ioctl_init(void)
zfs_ioc_pool_sync, zfs_secpolicy_none, POOL_NAME,
POOL_CHECK_SUSPENDED | POOL_CHECK_READONLY, B_FALSE, B_FALSE);

zfs_ioctl_register("channel_program", ZFS_IOC_CHANNEL_PROGRAM,
zfs_ioc_channel_program, zfs_secpolicy_config,

This comment has been minimized.

Copy link
@alek-p

alek-p Oct 3, 2017

Contributor

I might be paranoid but perhaps it makes sense to also make sure we have the CAP_SYS_MODULE capacity, instead of just CAP_SYS_ADMIN which is what the zfs_secpolicy_config() checks for.
Might be overkill but reasoning about the security of this becomes easier: if we have CAP_SYS_MODULE we can change the kernel in arbitrary ways anyway so a restricted Lua script is not a problem.

This comment has been minimized.

Copy link
@behlendorf

behlendorf Oct 6, 2017

Member

Paranoid is good, but I'm not sure CAP_SYS_MODULE is appropriate. Configuring the network stack in Linux also depends on an in-kernel language and they previously used CAP_SYS_MODULE for the reasons you mentioned before switching to CAP_NET_ADMIN. Today CAP_SYS_MODULE only covers loading/unloading kmods. Adding CAP_NET_ADMIN sounds like a good idea, even if the name is misleading.

@Conan-Kudo

This comment has been minimized.

Copy link
Contributor

commented Oct 5, 2017

This really smells like a bad idea. I get that Illumos and FreeBSD have it, but I'm concerned of the implications of allowing Lua script/programs to run amok in the Linux kernel. I'd like to see a build-time flag to completely disable this component from being compiled in and supported (I didn't see one, but maybe I missed something...).

@behlendorf

This comment has been minimized.

Copy link
Member

commented Oct 6, 2017

@Conan-Kudo I share your concerns, I had the same initial misgivings when this was first proposed. But in reality the kernel already depends on multiple internal virtual machines for several critical subsystems. For example, the network subsystem depends on BPF to flexibly configure interfaces and do efficient packet filtering. Heck, this isn't even the first Lua interpreter which has been added to the kernel since it used by ktap.

Now we definitely do need to get this right. Which is why we're using the canonical Lua.org implementation and have added test coverage of the interpreter itself. But this really isn't new territory for a kernel anymore and it's been proven to be a good solution.

As for adding a configure option to disable it, that's something worth considering. However, it would result in certain features being total non-functional.

@Conan-Kudo

This comment has been minimized.

Copy link
Contributor

commented Oct 6, 2017

As for adding a configure option to disable it, that's something worth considering. However, it would result in certain features being total non-functional.

I think that's sort of the point, and I would be fine with this. But if this is unacceptable, some kind of runtime guard (option flag required to enable it when loading the kmod or something) to ensure it's not something that can cause more fundamental problems would be fine as well. People who would want to use it need to explicitly activate the feature (either at build or module load time), or it is dead code.

@behlendorf

This comment has been minimized.

Copy link
Member

commented Dec 5, 2017

@glaubitz that'd be great.

@glaubitz

This comment has been minimized.

Copy link
Contributor

commented Dec 5, 2017

@behlendorf Ok, working on it now.

@glaubitz

This comment has been minimized.

Copy link
Contributor

commented Dec 5, 2017

Building with -Werror, the default with --enable-debug, gives:

  CC       vdev_raidz_math.lo
../../module/zfs/vdev_raidz_math.c: In function ‘vdev_raidz_math_get_ops’:
../../module/zfs/vdev_raidz_math.c:135:24: error: array subscript is above array bounds [-Werror=array-bounds]
   ops = raidz_supp_impl[impl];
         ~~~~~~~~~~~~~~~^~~~~~
cc1: all warnings being treated as errors
Makefile:999: recipe for target 'vdev_raidz_math.lo' failed

with gcc-7.

@glaubitz

This comment has been minimized.

Copy link
Contributor

commented Dec 5, 2017

The testsuite won't run. Installed all the utils available in Debian though:

glaubitz@testvm:~/zfs/zfs$ cat /sys/module/zfs/version
0.7.0-208_g122b82733
glaubitz@testvm:~/zfs/zfs$ ./scripts/zfs-tests.sh -v -r zcp

--- Configuration ---
Runfile:         /home/glaubitz/zfs/zfs/tests/runfiles/zcp.run
STF_TOOLS:       /home/glaubitz/zfs/zfs/tests/test-runner
STF_SUITE:       /home/glaubitz/zfs/zfs/tests/zfs-tests
STF_PATH:        /home/glaubitz/zfs/zfs/bin
FILEDIR:         /var/tmp
FILES:           /var/tmp/file-vdev0 /var/tmp/file-vdev1 /var/tmp/file-vdev2
LOOPBACKS:       /dev/loop0 /dev/loop1 /dev/loop2 
DISKS:           loop0 loop1 loop2 
NUM_DISKS:       3
FILESIZE:        4G
ITERATIONS:      1
TAGS:            functional
Keep pool(s):    rpool
Missing util(s): devname2devid umask wait 

/home/glaubitz/zfs/zfs/tests/test-runner/bin/test-runner.py  -c /home/glaubitz/zfs/zfs/tests/runfiles/zcp.run -T functional -i /home/glaubitz/zfs/zfs/tests/zfs-tests -I 1

glaubitz@testvm:~/zfs/zfs$
@behlendorf

This comment has been minimized.

Copy link
Member

commented Dec 6, 2017

@glaubitz sorry about that.

Bad news, due to commit 9a810ef which this was rebased on the original zcp runfile I included above will no longer work. This is likely why the ZCP didn't run for you.

Good news, because of that commit a custom run file is no longer needed and a tag can be used to specify the tests which should be run. Can you try running the tests again like this:

./scripts/zfs-tests.sh -v -T channel_program

The vdev_raidz_math_get_ops warning is unrelated to this PR but something we should fix anyway. I'm not sure why we didn't see it on Fedora 27 with gcc-7.2.1.

@glaubitz

This comment has been minimized.

Copy link
Contributor

commented Dec 7, 2017

Doesn't look too good, unfortunately:

glaubitz@testvm:~/zfs/zfs$ ./scripts/zfs-tests.sh -v -T channel_program

--- Configuration ---
Runfile:         /home/glaubitz/zfs/zfs/tests/runfiles/linux.run
STF_TOOLS:       /home/glaubitz/zfs/zfs/tests/test-runner
STF_SUITE:       /home/glaubitz/zfs/zfs/tests/zfs-tests
STF_PATH:        /home/glaubitz/zfs/zfs/bin
[   97.591830] loop: module loaded
FILEDIR:         /var/tmp
FILES:           /var/tmp/file-vdev0 /var/tmp/file-vdev1 /var/tmp/file-vdev2
LOOPBACKS:       /dev/loop0 /dev/loop1 /dev/loop2 
DISKS:           loop0 loop1 loop2 
NUM_DISKS:       3
FILESIZE:        4G
ITERATIONS:      1
TAGS:            channel_program
Keep pool(s):    rpool
Missing util(s): devname2devid umask wait 

/home/glaubitz/zfs/zfs/tests/test-runner/bin/test-runner.py  -c /home/glaubitz/zfs/zfs/tests/runfiles/linux.run -T channel_program -i /home/glaubitz/zfs/zfs/tests/zfs-tests -I 1
Test: /home/glaubitz/zfs/zfs/tests/zfs-tests/tests/functional/channel_program/lua_core/setup (run as root) [00:01] [PASS]
[   99.595326] Kernel unaligned access at TPC[104cdd20] setjmp+0x0/0x20 [zlua]
[   99.595355] Kernel unaligned access at TPC[104cdd24] setjmp+0x4/0x20 [zlua]
[   99.595732] Kernel unaligned access at TPC[104cdd20] setjmp+0x0/0x20 [zlua]
[   99.595751] Kernel unaligned access at TPC[104cdd24] setjmp+0x4/0x20 [zlua]
[   99.595910] Kernel unaligned access at TPC[104cdd20] setjmp+0x0/0x20 [zlua]
Test: /home/glaubitz/zfs/zfs/tests/zfs-tests/tests/functional/channel_program/lua_core/tst.args_to_lua (run as root) [00:00] [PASS]
[  260.277836] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [systemd-logind:328]
[  260.277864] Modules linked in: loop zfs(PO) icp(PO) zlua(PO) zcommon(PO) zunicode(PO) znvpair(PO) zavl(PO) splat(O) spl(O) zlib_deflate binfmt_misc camellia_sparc64 des_sparc64 des_generic n2_rng sha512_sparc64 rng_core sha256_sparc64 flash sha1_sparc64 nfsd md5_sparc64 auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb crc32c_sparc64 aes_sparc64 sunvnet sunvnet_common sunvdc
[  260.278085] CPU: 1 PID: 328 Comm: systemd-logind Tainted: P           O    4.14.0-1-sparc64-smp #1 Debian 4.14.2-1
[  260.278106] task: fff80000a123c180 task.stack: fff800009eff8000
[  260.278119] TSTATE: 0000000011001600 TPC: 00000000004fb02c TNPC: 00000000004fb030 Y: 00000000    Tainted: P           O   
[  260.278150] TPC: <smp_call_function_single+0xec/0x160>
[  260.278162] g0: 0000000000d8d400 g1: 0000000000000003 g2: 0000000000000800 g3: 0000000000d57000
[  260.278178] g4: fff80000a123c180 g5: fff80000ac2f0000 g6: fff800009eff8000 g7: 0000000000000000
[  260.278193] o0: 0000000000000000 o1: fff800009effb2a0 o2: 000000000043f1c0 o3: fff800009effb448
[  260.278208] o4: 0000000000000000 o5: 0000000000000000 sp: fff800009effa9f1 ret_pc: 00000000004fb00c
[  260.278225] RPC: <smp_call_function_single+0xcc/0x160>
[  260.278236] l0: 0000000000440284 l1: fff800009effab91 l2: 0000000000d8d400 l3: 0000000000b59212
[  260.278252] l4: 0000000000000002 l5: 0000000000000000 l6: 0000000000000000 l7: fff8000100d7e000
[  260.278266] i0: 0000000000000000 i1: 000000000043f1c0 i2: fff800009effb448 i3: 0000000000000001
[  260.278281] i4: 0000000000000000 i5: fff800009effb2a0 i6: fff800009effaae1 i7: 00000000004fb4e4
[  260.278298] I7: <smp_call_function_many+0x304/0x340>
[  260.278309] Call Trace:
[  260.278320]  [00000000004fb4e4] smp_call_function_many+0x304/0x340
[  260.278345]  [0000000000440284] smp_flush_tlb_pending+0x44/0xc0
[  260.278365]  [000000000045774c] flush_tlb_pending+0x6c/0xa0
[  260.278378]  [0000000000457904] arch_leave_lazy_mmu_mode+0x24/0x40
[  260.278399]  [00000000005d4f4c] unmap_page_range+0x38c/0x900
[  260.278412]  [00000000005d5538] unmap_single_vma+0x78/0xe0
[  260.278425]  [00000000005d5878] unmap_vmas+0x58/0xe0
[  260.278439]  [00000000005dea0c] exit_mmap+0x8c/0x1a0
[  260.278452]  [00000000004690f0] mmput+0x70/0x140
[  260.278467]  [0000000000470d2c] do_exit+0x28c/0xb80
[  260.278479]  [00000000004716cc] do_group_exit+0x2c/0xe0
[  260.278496]  [000000000047d4ac] get_signal+0x3ac/0x640
[  260.278516]  [000000000042db3c] do_signal+0x3c/0x480
[  260.278528]  [000000000042e7d0] do_notify_resume+0x50/0xa0
[  260.278543]  [0000000000404b44] __handle_signal+0xc/0x2c

I suggest someone looking at the code should tackle the unaligned access first.

@behlendorf

This comment has been minimized.

Copy link
Member

commented Jan 10, 2018

@don-brady when you get a chance could you please rebase this on master.

@don-brady don-brady force-pushed the don-brady:zfs-channel-programs branch from 122b827 to 9774d87 Feb 7, 2018

@codecov

This comment has been minimized.

Copy link

commented Feb 7, 2018

Codecov Report

Merging #6558 into master will increase coverage by 0.67%.
The diff coverage is 80.96%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6558      +/-   ##
==========================================
+ Coverage    75.5%   76.17%   +0.67%     
==========================================
  Files         296      324      +28     
  Lines       95908   102928    +7020     
==========================================
+ Hits        72411    78410    +5999     
- Misses      23497    24518    +1021
Flag Coverage Δ
#kernel 75.66% <82.79%> (+0.51%) ⬆️
#user 65.71% <40.77%> (-1.88%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f259b5...7394206. Read the comment docs.

cwill and others added some commits Feb 8, 2018

OpenZFS 7431 - ZFS Channel Programs
Authored by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Garrett D'Amore <garrett@damore.org>
Ported-by: Don Brady <don.brady@delphix.com>
Ported-by: John Kennedy <john.kennedy@delphix.com>

OpenZFS-issue: https://www.illumos.org/issues/7431
OpenZFS-commit: openzfs/openzfs@dfc1153

Porting Notes:
* The CLI long option arguments for '-t' and '-m' don't parse on linux
* Switched from kmem_alloc to vmem_alloc in zcp_lua_alloc
* Lua implementation is built as its own module (zlua.ko)
* Lua headers consumed directly by zfs code moved to 'include/sys/lua/'
* There is no native setjmp/longjump available in stock Linux kernel.
  Brought over implementations from illumos and FreeBSD
* The get_temporary_prop() was adapted due to VFS platform differences
* Use of inline functions in lua parser to reduce stack usage per C call
* Skip some ZFS Test Suite ZCP tests on sparc64 to avoid stack overflow
OpenZFS 8605 - zfs channel programs fix zfs.exists
Authored by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Don Brady <don.brady@delphix.com>

zfs.exists() in channel programs doesn't return any result, and should
have a man page entry. This patch corrects zfs.exists so that it
returns a value indicating if the dataset exists or not. It also adds
documentation about it in the man page.

OpenZFS-issue: https://www.illumos.org/issues/8605
OpenZFS-commit: openzfs/openzfs@1e85e11
OpenZFS 8592 - ZFS channel programs - rollback
Authored by: Brad Lewis <brad.lewis@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Don Brady <don.brady@delphix.com>

ZFS channel programs should be able to perform a rollback.

OpenZFS-issue: https://www.illumos.org/issues/8592
OpenZFS-commit: openzfs/openzfs@d46b5ed
OpenZFS 8600 - ZFS channel programs - snapshot
Authored by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Brad Lewis <brad.lewis@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Don Brady <don.brady@delphix.com>

ZFS channel programs should be able to create snapshots.
In addition to the base snapshot functionality, this entails extra
logic to handle edge cases which were formerly not possible, such as
creating then destroying a snapshot in the same transaction sync.

OpenZFS-issue: https://www.illumos.org/issues/8600
OpenZFS-commit: openzfs/openzfs@68089b8
Add basic functional tests for zcp user properties
Signed-off-by: Don Brady <don.brady@delphix.com>
Increase code coverage for Lua libraries
Add test coverage for lua libraries
Remove dead code in Lua implementation

Signed-off-by: Don Brady <don.brady@delphix.com>
OpenZFS 8604 - Simplify snapshots unmounting code
Authored by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Don Brady <don.brady@delphix.com>

Every time we want to unmount a snapshot (happens during snapshot
deletion or renaming) we unnecessarily iterate through all the
mountpoints in the VFS layer (see zfs_get_vfs).

The current patch completely gets rid of that code and changes
the approach while keeping the behavior of that code path the
same. Specifically, it puts a hold on the dataset/snapshot and
gets its vfs resource reference directly, instead of linearly
searching for it. If that reference exists we attempt to amount
it.

With the above change, it became obvious that the nvlist
manipulations that we do (add_boolean and add_nvlist) take a
significant amount of time ensuring uniqueness of every new
element even though they don't have too. Thus, we updated the
patch so those nvlists are not trying to enforce the uniqueness
of their elements.

A more complete analysis of the problem solved by this patch
can be found below:
https://sdimitro.github.io/post/snap-unmount-perf/

OpenZFS-issue: https://www.illumos.org/issues/8604
OpenZFS-commit: openzfs/openzfs@126118f
OpenZFS 8677 - Open-Context Channel Programs
Authored by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Don Brady <don.brady@delphix.com>

We want to be able to run channel programs outside of synching
context. This would greatly improve performance for channel programs
that just gather information, as they won't have to wait for synching
context anymore.

=== What is implemented?

This feature introduces the following:
- A new command line flag in "zfs program" to specify our intention
  to run in open context. (The -n option)
- A new flag/option within the channel program ioctl which selects
  the context.
- Appropriate error handling whenever we try a channel program in
  open-context that contains zfs.sync* expressions.
- Documentation for the new feature in the manual pages.

=== How do we handle zfs.sync functions in open context?

When such a function is found by the interpreter and we are running
in open context we abort the script and we spit out a descriptive
runtime error. For example, given the script below ...

arg = ...
fs = arg["argv"][1]
err = zfs.sync.destroy(fs)
msg = "destroying " .. fs .. " err=" .. err
return msg

if we run it in open context, we will get back the following error:

Channel program execution failed:
[string "channel program"]:3: running functions from the zfs.sync
submodule requires passing sync=TRUE to lzc_channel_program()
(i.e. do not specify the "-n" command line argument)
stack traceback:
            [C]: in function 'destroy'
            [string "channel program"]:3: in main chunk

=== What about testing?

We've introduced new wrappers for all channel program tests that
run each channel program as both (startard & open-context) and
expect the appropriate behavior depending on the program using
the zfs.sync module.

OpenZFS-issue: https://www.illumos.org/issues/8677
OpenZFS-commit: openzfs/openzfs@17a49e1

@don-brady don-brady force-pushed the don-brady:zfs-channel-programs branch from 9774d87 to 7394206 Feb 8, 2018

@behlendorf behlendorf closed this in 5b72a38 Feb 9, 2018

Nasf-Fan added a commit to Nasf-Fan/zfs that referenced this pull request Feb 13, 2018

OpenZFS 8677 - Open-Context Channel Programs
Authored by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Chris Williamson <chris.williamson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Don Brady <don.brady@delphix.com>

We want to be able to run channel programs outside of synching
context. This would greatly improve performance for channel programs
that just gather information, as they won't have to wait for synching
context anymore.

=== What is implemented?

This feature introduces the following:
- A new command line flag in "zfs program" to specify our intention
  to run in open context. (The -n option)
- A new flag/option within the channel program ioctl which selects
  the context.
- Appropriate error handling whenever we try a channel program in
  open-context that contains zfs.sync* expressions.
- Documentation for the new feature in the manual pages.

=== How do we handle zfs.sync functions in open context?

When such a function is found by the interpreter and we are running
in open context we abort the script and we spit out a descriptive
runtime error. For example, given the script below ...

arg = ...
fs = arg["argv"][1]
err = zfs.sync.destroy(fs)
msg = "destroying " .. fs .. " err=" .. err
return msg

if we run it in open context, we will get back the following error:

Channel program execution failed:
[string "channel program"]:3: running functions from the zfs.sync
submodule requires passing sync=TRUE to lzc_channel_program()
(i.e. do not specify the "-n" command line argument)
stack traceback:
            [C]: in function 'destroy'
            [string "channel program"]:3: in main chunk

=== What about testing?

We've introduced new wrappers for all channel program tests that
run each channel program as both (startard & open-context) and
expect the appropriate behavior depending on the program using
the zfs.sync module.

OpenZFS-issue: https://www.illumos.org/issues/8677
OpenZFS-commit: openzfs/openzfs@17a49e1
Closes zfsonlinux#6558

tonyhutter added a commit to tonyhutter/spl that referenced this pull request Mar 6, 2018

Remove all spin_is_locked calls
On systems with CONFIG_SMP turned off, spin_is_locked always returns
false causing these assertions to fail. Remove them as suggested in
zfsonlinux/zfs#6558.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: James Cowgill <james.cowgill@mips.com>
Closes zfsonlinux#665

@alek-p alek-p referenced this pull request Mar 7, 2018

Merged

add JSON output support to channel programs #7281

7 of 13 tasks complete

behlendorf added a commit that referenced this pull request Mar 19, 2018

Add JSON output support to channel programs
The changes piggyback JSON output support on top of channel programs 
(#6558).  This way the JSON output support is targeted to scripting 
use cases and is easily maintainable since it really only touches 
one function (zfs_do_channel_program()).

This patch ports Joyent's JSON nvlist library from illumos to enable 
easy JSON printing of channel program output nvlist.  To keep the 
delta small I also took advantage of the fact that printing in
zfs_do_channel_program() was almost always done before exiting 
the program.

Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #7281

tonyhutter added a commit to zfsonlinux/spl that referenced this pull request Mar 19, 2018

Remove all spin_is_locked calls
On systems with CONFIG_SMP turned off, spin_is_locked always returns
false causing these assertions to fail. Remove them as suggested in
zfsonlinux/zfs#6558.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: James Cowgill <james.cowgill@mips.com>
Closes #665
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.