A light-weight process isolation tool, making use of Linux namespaces and seccomp-bpf syscall filters (with help of the kafel bpf language)
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
configs use new kafel features in configs and examples Sep 6, 2018
kafel @ e9bd86f update kafel Sep 6, 2018
.gitignore .gitignore: ignore config.pb.* Oct 1, 2017
.gitmodules config: Initial work on converting config.c to c++ protobuf lib Sep 14, 2017
CONTRIBUTING Initial import May 14, 2015
Dockerfile Build docker image from current source Apr 11, 2018
LICENSE Initial import May 14, 2015
Makefile Makefile: lower -Wformat to 1 Jun 19, 2018
README.md use new kafel features in configs and examples Sep 6, 2018
caps.cc util: remove unused sSnPrintf May 24, 2018
caps.h omit keyword 'struct' Feb 10, 2018
cgroup.cc cgroup: refactor cgroup code Jul 26, 2018
cgroup.h omit keyword 'struct' Feb 10, 2018
cmdline.cc cmdline: more stderr_to_null closer to is_silent Jun 25, 2018
cmdline.h omit keyword 'struct' Feb 10, 2018
config.cc config: correct way of setting pass_fd Jul 31, 2018
config.h omit keyword 'struct' Feb 10, 2018
config.proto Update config.proto Jul 31, 2018
contain.cc config: Implement --stderr_to_null Jun 25, 2018
contain.h omit keyword 'struct' Feb 10, 2018
cpu.cc log: rename log to logs due to clash with glibc's log Feb 10, 2018
cpu.h omit keyword 'struct' Feb 10, 2018
logs.cc move isatty after log_fd is set Jun 7, 2018
logs.h logs: use log file/level immediately Jun 7, 2018
macros.h rename ARRAYSIZE to ARR_SZ due to clash with protobufs headers Feb 13, 2018
mnt.cc mnt: function rename Jul 28, 2018
mnt.h mnt: move mnt_t to std::string Feb 11, 2018
net.cc net: use memset to init stack structs Jun 20, 2018
net.h net: convert net::connToText to std::string Feb 10, 2018
nsjail.1 README.md, nsjail.1: add --stderr_to_null option Jul 14, 2018
nsjail.cc subproc: reap processes after killing Jul 27, 2018
nsjail.h cmdline: more stderr_to_null closer to is_silent Jun 25, 2018
pid.cc log: rename log to logs due to clash with glibc's log Feb 10, 2018
pid.h omit keyword 'struct' Feb 10, 2018
sandbox.cc sandbox: casting for syscall() May 23, 2018
sandbox.h nsjail: free seccomp filter upon nsjail exit Feb 12, 2018
subproc.cc configs/bash: add noexec/nodev/nosuid to a mount Jul 27, 2018
subproc.h subproc: reap processes after killing Jul 27, 2018
user.cc use strtoimax when needed May 26, 2018
user.h cmdline: simplify string splitting Feb 11, 2018
util.cc util: c++ version of sprintf Jun 16, 2018
util.h cgroup: refactor cgroup code Jul 26, 2018
uts.cc uts: simplify sethostname Feb 14, 2018
uts.h omit keyword 'struct' Feb 10, 2018

README.md


This is NOT an official Google product.


Overview

NsJail is a process isolation tool for Linux. It utilizes Linux namespace subsystem, resource limits, and the seccomp-bpf syscall filters of the Linux kernel.

It can help you with (among other things):

  • Isolating networking services (e.g. web, time, DNS), by isolating them from the rest of the OS
  • Hosting computer security challenges (so-called CTFs)
  • Containing invasive syscall-level OS fuzzers

Features:


What forms of isolation does it provide

  1. Linux namespaces: UTS (hostname), MOUNT (chroot), PID (separate PID tree), IPC, NET (separate networking context), USER, CGROUPS
  2. FS constraints: chroot(), pivot_root(), RO-remounting, custom /proc and tmpfs mount points
  3. Resource limits (wall-time/CPU time limits, VM/mem address space limits, etc.)
  4. Programmable seccomp-bpf syscall filters (through the kafel language)
  5. Cloned and isolated Ethernet interfaces
  6. Cgroups for memory and PID utilization control

Which use-cases are supported

Isolation of network services (inetd style)

PS: You'll need to have a valid file-system tree in /chroot. If you don't have it, change /chroot to /

  • Server:
 $ ./nsjail -Ml --port 9000 --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
  • Client:
 $ nc 127.0.0.1 9000
 / $ ifconfig
 / $ ifconfig -a
 lo    Link encap:Local Loopback
       LOOPBACK  MTU:65536  Metric:1
       RX packets:0 errors:0 dropped:0 overruns:0 frame:0
       TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
       RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
 / $ ps wuax
 PID   USER     COMMAND
 1 99999    /bin/sh -i
 3 99999    {busybox} ps wuax
 / $

Isolation with access to a private, cloned interface (requires root/setuid)

PS: You'll need to have a valid file-system tree in /chroot. If you don't have it, change /chroot to /

$ sudo ./nsjail --user 9999 --group 9999 --macvlan_iface eth0 --chroot /chroot/ -Mo --macvlan_vs_ip 192.168.0.44 --macvlan_vs_nm 255.255.255.0 --macvlan_vs_gw 192.168.0.1 -- /bin/sh -i
/ $ id
uid=9999 gid=9999
/ $ ip addr sh
1: lo:  mtu 65536 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: vs:  mtu 1500 qdisc noqueue 
    link/ether ca:a2:69:21:33:66 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.44/24 brd 192.168.0.255 scope global vs
       valid_lft forever preferred_lft forever
    inet6 fe80::c8a2:69ff:fe21:cd66/64 scope link 
       valid_lft forever preferred_lft forever
/ $ nc 217.146.165.209 80
GET / HTTP/1.0

HTTP/1.0 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: https://www.google.ch/?gfe_rd=cr&ei=cEzWVrG2CeTI8ge88ofwDA
Content-Length: 258
Date: Wed, 02 Mar 2016 02:14:08 GMT

...
...
/ $ 

Isolation of local processes

PS: You'll need to have a valid file-system tree in /chroot. If you don't have it, change /chroot to /

 $ ./nsjail -Mo --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
 / $ ifconfig -a
 lo    Link encap:Local Loopback
       LOOPBACK  MTU:65536  Metric:1
       RX packets:0 errors:0 dropped:0 overruns:0 frame:0
       TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
       RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
 / $ id
 uid=99999 gid=99999
 / $ ps wuax
 PID   USER     COMMAND
 1 99999    /bin/sh -i
 4 99999    {busybox} ps wuax
 / $exit
 $

Isolation of local processes (and re-running them, if necessary)

PS: You'll need to have a valid file-system tree in /chroot. If you don't have it, change /chroot to /

 $ ./nsjail -Mr --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
 BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1) built-in shell (ash)
 Enter 'help' for a list of built-in commands.
 / $ ps wuax
 PID   USER     COMMAND
 1 99999    /bin/sh -i
 2 99999    {busybox} ps wuax
 / $ exit
 BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1) built-in shell (ash)
 Enter 'help' for a list of built-in commands.
 / $ ps wuax
 PID   USER     COMMAND
 1 99999    /bin/sh -i
 2 99999    {busybox} ps wuax
 / $

Bash in a minimal file-system with uid==0 and access to /dev/urandom only

$ ./nsjail -Mo --user 0 --group 99999 -R /bin/ -R /lib -R /lib64/ -R /usr/ -R /sbin/ -T /dev -R /dev/urandom --keep_caps -- /bin/bash -i
[2017-05-24T17:08:02+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:08:02+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/bin/bash', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:true, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/bin/' dst:'/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/lib' dst:'/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/lib64/' dst:'/lib64/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/usr/' dst:'/usr/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/sbin/' dst:'/sbin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/dev' type:'tmpfs' flags:0 options:'size=4194304' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/dev/urandom' dst:'/dev/urandom' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:08:02+0200] Uid map: inside_uid:0 outside_uid:69664
[2017-05-24T17:08:02+0200] Gid map: inside_gid:99999 outside_gid:5000
[2017-05-24T17:08:02+0200] Executing '/bin/bash' for '[STANDALONE_MODE]'
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
bash-4.3# ls -l
total 28
drwxr-xr-x   2 65534 65534  4096 May 15 14:04 bin
drwxrwxrwt   2     0 99999    60 May 24 15:08 dev
drwxr-xr-x  28 65534 65534  4096 May 15 14:10 lib
drwxr-xr-x   2 65534 65534  4096 May 15 13:56 lib64
dr-xr-xr-x 391 65534 65534     0 May 24 15:08 proc
drwxr-xr-x   2 65534 65534 12288 May 15 14:16 sbin
drwxr-xr-x  17 65534 65534  4096 May 15 13:58 usr
bash-4.3# id
uid=0 gid=99999 groups=65534,99999
bash-4.3# exit
exit
[2017-05-24T17:08:05+0200] PID: 129839 exited with status: 0, (PIDs left: 0)

/usr/bin/find in a minimal file-system (only /usr/bin/find accessible from /usr/bin)

$ ./nsjail -Mo --user 99999 --group 99999 -R /lib/x86_64-linux-gnu/ -R /lib/x86_64-linux-gnu -R /lib64 -R /usr/bin/find -R /dev/urandom --keep_caps -- /usr/bin/find / | wc -l
[2017-05-24T17:04:37+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:04:37+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/usr/bin/find', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:true, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:04:37+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib/x86_64-linux-gnu/' dst:'/lib/x86_64-linux-gnu/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib/x86_64-linux-gnu' dst:'/lib/x86_64-linux-gnu' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib64' dst:'/lib64' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/usr/bin/find' dst:'/usr/bin/find' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:04:37+0200] Mount point: src:'/dev/urandom' dst:'/dev/urandom' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:04:37+0200] Uid map: inside_uid:99999 outside_uid:69664
[2017-05-24T17:04:37+0200] Gid map: inside_gid:99999 outside_gid:5000
[2017-05-24T17:04:37+0200] Executing '/usr/bin/find' for '[STANDALONE_MODE]'
/usr/bin/find: `/proc/tty/driver': Permission denied
2289
[2017-05-24T17:04:37+0200] PID: 129525 exited with status: 1, (PIDs left: 0)

Using /etc/subuid

$ tail -n1 /etc/subuid
user:10000000:1
$ ./nsjail -R /lib -R /lib64/ -R /usr/lib -R /usr/bin/ -R /usr/sbin/ -R /bin/ -R /sbin/ -R /dev/null -U 0:10000000:1 -u 0 -R /tmp/ -T /tmp/ -- /bin/ls -l /usr/
[2017-05-24T17:12:31+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:12:31+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/bin/ls', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:12:31+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/lib' dst:'/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/lib64/' dst:'/lib64/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/lib' dst:'/usr/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/bin/' dst:'/usr/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/sbin/' dst:'/usr/sbin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/bin/' dst:'/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/sbin/' dst:'/sbin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/dev/null' dst:'/dev/null' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:12:31+0200] Mount point: src:'/tmp/' dst:'/tmp/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'none' dst:'/tmp/' type:'tmpfs' flags:0 options:'size=4194304' isDir:True
[2017-05-24T17:12:31+0200] Uid map: inside_uid:0 outside_uid:69664
[2017-05-24T17:12:31+0200] Gid map: inside_gid:5000 outside_gid:5000
[2017-05-24T17:12:31+0200] Newuid mapping: inside_uid:'0' outside_uid:'10000000' count:'1'
[2017-05-24T17:12:31+0200] Executing '/bin/ls' for '[STANDALONE_MODE]'
total 120
drwxr-xr-x   5 65534 65534 77824 May 24 12:25 bin
drwxr-xr-x 210 65534 65534 20480 May 22 16:11 lib
drwxr-xr-x   4 65534 65534 20480 May 24 00:24 sbin
[2017-05-24T17:12:31+0200] PID: 130841 exited with status: 0, (PIDs left: 0)

Even more contrained shell (with seccomp-bpf policies)

$ ./nsjail --chroot / --seccomp_string 'ALLOW { write, execve, brk, access, mmap, open, openat, newfstat, close, read, mprotect, arch_prctl, munmap, getuid, getgid, getpid, rt_sigaction, geteuid, getppid, getcwd, getegid, ioctl, fcntl, newstat, clone, wait4, rt_sigreturn, exit_group } DEFAULT KILL' -- /bin/sh -i
[2017-01-15T21:53:08+0100] Mode: STANDALONE_ONCE
[2017-01-15T21:53:08+0100] Jail parameters: hostname:'NSJAIL', chroot:'/', process:'/bin/sh', bind:[::]:0, max_conns_per_ip:0, uid:(ns:1000, global:1000), gid:(ns:1000, global:1000), time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-01-15T21:53:08+0100] Mount point: src:'/' dst:'/' type:'' flags:0x5001 options:''
[2017-01-15T21:53:08+0100] Mount point: src:'(null)' dst:'/proc' type:'proc' flags:0x0 options:''
[2017-01-15T21:53:08+0100] PID: 18873 about to execute '/bin/sh' for [STANDALONE_MODE]
/bin/sh: 0: can't access tty; job control turned off
$ set
IFS='
'
OPTIND='1'
PATH='/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
PPID='0'
PS1='$ '
PS2='> '
PS4='+ '
PWD='/'
$ id
Bad system call
$ exit
[2017-01-15T21:53:17+0100] PID: 18873 exited with status: 159, (PIDs left: 0)

Configuration file

You will also find all examples in the configs directory.


config.proto contains ProtoBuf schema for nsjail's configuration format.


You can examine an example config file in configs/bash-with-fake-geteuid.cfg.

Usage:

$ ./nsjail --config configs/bash-with-fake-geteuid.cfg

You can also override certain options with command-line options. Here, the executed binary (/bin/bash) is overriden with /usr/bin/id, yet options from configs/bash-with-fake-geteuid.cfg still apply

$ ./nsjail --config configs/bash-with-fake-geteuid.cfg -- /usr/bin/id
...
[INSIDE-JAIL]: id
uid=999999 gid=999998 euid=4294965959 groups=999998,65534
[INSIDE-JAIL]: exit
[2017-05-27T18:45:40+0200] PID: 16579 exited with status: 0, (PIDs left: 0)

You might also want to try using configs/home-documents-with-xorg-no-net.cfg.

$ ./nsjail --config configs/home-documents-with-xorg-no-net.cfg -- /usr/bin/evince /user/Documents/doc.pdf
$ ./nsjail --config configs/home-documents-with-xorg-no-net.cfg -- /usr/bin/geeqie /user/Documents/
$ ./nsjail --config configs/home-documents-with-xorg-no-net.cfg -- /usr/bin/gv /user/Documents/doc.pdf
$ ./nsjail --config configs/home-documents-with-xorg-no-net.cfg -- /usr/bin/mupdf /user/Documents/doc.pdf

The configs/firefox-with-net.cfg config file will allow you to run firefox inside a sandboxed environment:

$ ./nsjail --config configs/firefox-with-net.cfg

A more complex setup, which utilizes virtualized (cloned) Ethernet interfaces (to separate it from the main network namespace), can be found in configs/firefox-with-cloned-net.cfg. Remember to change relevant UIDs and Ethernet interface names before use.

As using cloned Ethernet interfaces (MACVTAP) required root privileges, you'll have to run it under sudo:

$ sudo ./nsjail --config configs/firefox-with-cloned-net.cfg

More info

The command-line options should be self-explanatory, while the proto-buf config options are described in config.proto

./nsjail --help
Usage: ./nsjail [options] -- path_to_command [args]
Options:
 --help|-h 
	Help plz..
 --mode|-M VALUE
	Execution mode (default: 'o' [MODE_STANDALONE_ONCE]):
	l: Wait for connections on a TCP port (specified with --port) [MODE_LISTEN_TCP]
	o: Launch a single process on the console using clone/execve [MODE_STANDALONE_ONCE]
	e: Launch a single process on the console using execve [MODE_STANDALONE_EXECVE]
	r: Launch a single process on the console with clone/execve, keep doing it forever [MODE_STANDALONE_RERUN]
 --config|-C VALUE
	Configuration file in the config.proto ProtoBuf format (see configs/ directory for examples)
 --exec_file|-x VALUE
	File to exec (default: argv[0])
 --execute_fd 
	Use execveat() to execute a file-descriptor instead of executing the binary path. In such case argv[0]/exec_file denotes a file path before mount namespacing
 --chroot|-c VALUE
	Directory containing / of the jail (default: none)
 --rw 
	Mount chroot dir (/) R/W (default: R/O)
 --user|-u VALUE
	Username/uid of processess inside the jail (default: your current uid). You can also use inside_ns_uid:outside_ns_uid:count convention here. Can be specified multiple times
 --group|-g VALUE
	Groupname/gid of processess inside the jail (default: your current gid). You can also use inside_ns_gid:global_ns_gid:count convention here. Can be specified multiple times
 --hostname|-H VALUE
	UTS name (hostname) of the jail (default: 'NSJAIL')
 --cwd|-D VALUE
	Directory in the namespace the process will run (default: '/')
 --port|-p VALUE
	TCP port to bind to (enables MODE_LISTEN_TCP) (default: 0)
 --bindhost VALUE
	IP address to bind the port to (only in [MODE_LISTEN_TCP]), (default: '::')
 --max_conns_per_ip|-i VALUE
	Maximum number of connections per one IP (only in [MODE_LISTEN_TCP]), (default: 0 (unlimited))
 --log|-l VALUE
	Log file (default: use log_fd)
 --log_fd|-L VALUE
	Log FD (default: 2)
 --time_limit|-t VALUE
	Maximum time that a jail can exist, in seconds (default: 600)
 --max_cpus VALUE
	Maximum number of CPUs a single jailed process can use (default: 0 'no limit')
 --daemon|-d 
	Daemonize after start
 --verbose|-v 
	Verbose output
 --quiet|-q 
	Log warning and more important messages only
 --really_quiet|-Q 
	Log fatal messages only
 --keep_env|-e 
	Pass all environment variables to the child process (default: all envvars are cleared)
 --env|-E VALUE
	Additional environment variable (can be used multiple times)
 --keep_caps 
	Don't drop any capabilities
 --cap VALUE
	Retain this capability, e.g. CAP_PTRACE (can be specified multiple times)
 --silent 
	Redirect child process' fd:0/1/2 to /dev/null
 --stderr_to_null
	Redirect FD=2 (STDERR_FILENO) to /dev/null
 --skip_setsid 
	Don't call setsid(), allows for terminal signal handling in the sandboxed process. Dangerous
 --pass_fd VALUE
	Don't close this FD before executing the child process (can be specified multiple times), by default: 0/1/2 are kept open
 --disable_no_new_privs 
	Don't set the prctl(NO_NEW_PRIVS, 1) (DANGEROUS)
 --rlimit_as VALUE
	RLIMIT_AS in MB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 512)
 --rlimit_core VALUE
	RLIMIT_CORE in MB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 0)
 --rlimit_cpu VALUE
	RLIMIT_CPU, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 600)
 --rlimit_fsize VALUE
	RLIMIT_FSIZE in MB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 1)
 --rlimit_nofile VALUE
	RLIMIT_NOFILE, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 32)
 --rlimit_nproc VALUE
	RLIMIT_NPROC, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 'soft')
 --rlimit_stack VALUE
	RLIMIT_STACK in MB, 'max' or 'hard' for the current hard limit, 'def' or 'soft' for the current soft limit, 'inf' for RLIM64_INFINITY (default: 'soft')
 --persona_addr_compat_layout 
	personality(ADDR_COMPAT_LAYOUT)
 --persona_mmap_page_zero 
	personality(MMAP_PAGE_ZERO)
 --persona_read_implies_exec 
	personality(READ_IMPLIES_EXEC)
 --persona_addr_limit_3gb 
	personality(ADDR_LIMIT_3GB)
 --persona_addr_no_randomize 
	personality(ADDR_NO_RANDOMIZE)
 --disable_clone_newnet|-N 
	Don't use CLONE_NEWNET. Enable global networking inside the jail
 --disable_clone_newuser 
	Don't use CLONE_NEWUSER. Requires euid==0
 --disable_clone_newns 
	Don't use CLONE_NEWNS
 --disable_clone_newpid 
	Don't use CLONE_NEWPID
 --disable_clone_newipc 
	Don't use CLONE_NEWIPC
 --disable_clone_newuts 
	Don't use CLONE_NEWUTS
 --disable_clone_newcgroup 
	Don't use CLONE_NEWCGROUP. Might be required for kernel versions < 4.6
 --uid_mapping|-U VALUE
	Add a custom uid mapping of the form inside_uid:outside_uid:count. Setting this requires newuidmap (set-uid) to be present
 --gid_mapping|-G VALUE
	Add a custom gid mapping of the form inside_gid:outside_gid:count. Setting this requires newgidmap (set-uid) to be present
 --bindmount_ro|-R VALUE
	List of mountpoints to be mounted --bind (ro) inside the container. Can be specified multiple times. Supports 'source' syntax, or 'source:dest'
 --bindmount|-B VALUE
	List of mountpoints to be mounted --bind (rw) inside the container. Can be specified multiple times. Supports 'source' syntax, or 'source:dest'
 --tmpfsmount|-T VALUE
	List of mountpoints to be mounted as tmpfs (R/W) inside the container. Can be specified multiple times. Supports 'dest' syntax. Alternatively, use '-m none:dest:tmpfs:size=8388608'
 --mount|-m VALUE
	Arbitrary mount, format src:dst:fs_type:options
 --symlink|-s VALUE
	Symlink, format src:dst
 --disable_proc 
	Disable mounting procfs in the jail
 --proc_path VALUE
	Path used to mount procfs (default: '/proc')
 --proc_rw 
	Is procfs mounted as R/W (default: R/O)
 --seccomp_policy|-P VALUE
	Path to file containing seccomp-bpf policy (see kafel/)
 --seccomp_string VALUE
	String with kafel seccomp-bpf policy (see kafel/)
 --seccomp_log 
	Use SECCOMP_FILTER_FLAG_LOG. Log all actions except SECCOMP_RET_ALLOW). Supported since kernel version 4.14
 --cgroup_mem_max VALUE
	Maximum number of bytes to use in the group (default: '0' - disabled)
 --cgroup_mem_mount VALUE
	Location of memory cgroup FS (default: '/sys/fs/cgroup/memory')
 --cgroup_mem_parent VALUE
	Which pre-existing memory cgroup to use as a parent (default: 'NSJAIL')
 --cgroup_pids_max VALUE
	Maximum number of pids in a cgroup (default: '0' - disabled)
 --cgroup_pids_mount VALUE
	Location of pids cgroup FS (default: '/sys/fs/cgroup/pids')
 --cgroup_pids_parent VALUE
	Which pre-existing pids cgroup to use as a parent (default: 'NSJAIL')
 --cgroup_net_cls_classid VALUE
	Class identifier of network packets in the group (default: '0' - disabled)
 --cgroup_net_cls_mount VALUE
	Location of net_cls cgroup FS (default: '/sys/fs/cgroup/net_cls')
 --cgroup_net_cls_parent VALUE
	Which pre-existing net_cls cgroup to use as a parent (default: 'NSJAIL')
 --cgroup_cpu_ms_per_sec VALUE
	Number of us that the process group can use per second (default: '0' - disabled)
 --cgroup_cpu_mount VALUE
	Location of cpu cgroup FS (default: '/sys/fs/cgroup/net_cls')
 --cgroup_cpu_parent VALUE
	Which pre-existing cpu cgroup to use as a parent (default: 'NSJAIL')
 --iface_no_lo 
	Don't bring the 'lo' interface up
 --iface_own VALUE
	Move this existing network interface into the new NET namespace. Can be specified multiple times
 --macvlan_iface|-I VALUE
	Interface which will be cloned (MACVLAN) and put inside the subprocess' namespace as 'vs'
 --macvlan_vs_ip VALUE
	IP of the 'vs' interface (e.g. "192.168.0.1")
 --macvlan_vs_nm VALUE
	Netmask of the 'vs' interface (e.g. "255.255.255.0")
 --macvlan_vs_gw VALUE
	Default GW for the 'vs' interface (e.g. "192.168.0.1")

 Examples: 
 Wait on a port 31337 for connections, and run /bin/sh
  nsjail -Ml --port 31337 --chroot / -- /bin/sh -i
 Re-run echo command as a sub-process
  nsjail -Mr --chroot / -- /bin/echo "ABC"
 Run echo command once only, as a sub-process
  nsjail -Mo --chroot / -- /bin/echo "ABC"
 Execute echo command directly, without a supervising process
  nsjail -Me --chroot / --disable_proc -- /bin/echo "ABC"

Launching in Docker

To launch nsjail in a docker container clone the repository and build the docker image:

docker build -t nsjailcontainer .

This will build up an image containing njsail and kafel.

From now you can either use it in another Dockerfile (FROM nsjailcontainer) or directly:

docker run --privileged --rm -it nsjailcontainer nsjail --user 99999 --group 99999 --disable_proc --chroot / --time_limit 30 /bin/bash

Contact