Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: auditd logs for clone syscall shows exit code as container namespace pid of child process instead of host namespace pid #68

Open
madzcar opened this issue Sep 20, 2017 · 7 comments

Comments

@madzcar
Copy link

madzcar commented Sep 20, 2017

Hi ,

Consider the following setup:
Linux kernel 4.4.87-1.el7.elrepo.x86_64
audit-2.7.6-3.el7.x86_64
OS : Centos 7
docker :docker-ce-17.06.2.ce-1.el7.centos.x86_64
One docker container running a vsftpd server.
[root@localhost ~]# docker top 4ef7e4d900ff

[root@4ef7e4d900ff /]# ps -efa |grep vsftpd
root 33 1 0 10:35 ? 00:00:00 /usr/sbin/vsftpd
nobody 369 33 0 17:16 ? 00:00:00 /usr/sbin/vsftpd
ftpadmin 371 369 0 17:16 ? 00:00:00 /usr/sbin/vsftpd

[root@localhost ~]# docker top 4ef7e4d900ff
UID PID PPID C STIME TTY TIME CMD

root 2493 2439 0 16:05 ? 00:00:00 /usr/sbin/vsftpd
nobody 3179 2493 0 22:46 ? 00:00:00 /usr/sbin/vsftpd
1000 3181 3179 0 22:46 ? 00:00:00 /usr/sbin/vsftpd

The audit logs generated when a ftp connection is made, are showing up as below:
type=SYSCALL msg=audit(1505927792.724:8995): arch=c000003e syscall=43 success=yes exit=11 a0=6 a1=0 a2=0 a3=186a0 items=0 ppid=1869 pid=3143 auid=4294967295 uid=89 gid=89 euid=89 suid=89 fsuid=89 egid=89 sgid=89 fsgid=89 tty=(none) ses=4294967295 comm="pickup" exe="/usr/libexec/postfix/pickup" subj=system_u:system_r:postfix_pickup_t:s0 key="check_accept"

type=SYSCALL msg=audit(1505927787.052:8987): arch=c000003e syscall=56 success=yes exit=369 a0=1200011 a1=0 a2=0 a3=7f6df229fb50 items=0 ppid=2439 pid=2493 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="vsftpd" exe="/usr/sbin/vsftpd" subj=system_u:system_r:spc_t:s0 key="check_clone"

type=SYSCALL msg=audit(1505927787.054:8992): arch=c000003e syscall=56 success=yes exit=370 a0=1200011 a1=0 a2=0 a3=7f6df229fb50 items=0 ppid=2493 pid=3179 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="vsftpd" exe="/usr/sbin/vsftpd" subj=system_u:system_r:spc_t:s0 key="check_clone"

type=SYSCALL msg=audit(1505927793.609:9009): arch=c000003e syscall=56 success=yes exit=371 a0=1200011 a1=0 a2=0 a3=7f6df229fb50 items=0 ppid=2493 pid=3179 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="vsftpd" exe="/usr/sbin/vsftpd" subj=system_u:system_r:spc_t:s0 key="check_clone"

The exit code in the clone call logs shows the pid of the process which is within the container namespace, whereas the pid and ppids shown are host based pids. I believe these should be consistent, in the sense that the exit codes in the logs should be mapped to the actual host pids in the logs. This inconsistency is currently causing difficulties in tracking the child pid generated from the last clone call from the logs.

@rgbriggs
Copy link
Member

It could be as simple as:

rcu_read_lock()
return_code = task_tgid_nr(find_task_by_vpid(return_code))
rcu_read_unlock()

OR

rcu_read_lock()
return_code = pid_nr(find_vpid(int return_code);
rcu_read_unlock()

in audit_syscall_exit() just before the __audit_syscall_exit() call in include/linux/audit.h but I don't know which process context it is in at the time.

If it isn't that simple, it may need some work from all the callers of audit_syscall_exit() in each arch.

@rgbriggs
Copy link
Member

Actually, that would be better placed at the top of __audit_syscall_exit() if possible and maybe even audit_take_context().

@pcmoore pcmoore changed the title auditd logs for clone syscall shows exit code as container namespace pid of child process instead of host namespace pid BUG: auditd logs for clone syscall shows exit code as container namespace pid of child process instead of host namespace pid Nov 30, 2017
@madzcar
Copy link
Author

madzcar commented Jan 4, 2018

Guys,

Should I submit a patch to you guys for this ? I have tested what rgbriggs has mentioned above and it seems to work fine. It would be nice to have it asap.

@pcmoore
Copy link
Member

pcmoore commented Jan 4, 2018

@madzcar I haven't had time to look into this issue, or the solution that @rgbriggs proposed, but you can always feel free to submit patches to the mailing list and we can review/discuss them there (we don't use GitHub PRs for kernel patches, although we do use the GitHub issue tracker for pre-patch discussion and tracking).

@rgbriggs
Copy link
Member

rgbriggs commented Jan 8, 2018

Replacing this exit code with the PID in the initial PID namespace will mess up all callers (clone(2), fork(2), vfork(2) kernel_thread(), do_fork()), who expect the return value in the caller's PID namespace if it is not the initial PID namespace, so that won't work. The return value is technically correct for the PID namespace from which it was called and reported correctly in the audit record as the return value of that call.

@madzcar, the way you are trying to interpret the results from the audit record is clever, but not going to work without another way to translate that value lifted out of the audit record. I don't know if there is a userspace tool or call to translate PIDs between namespaces.

I recommend closing this as not a bug.

@madzcar
Copy link
Author

madzcar commented Jan 8, 2018

Rgb,
Appreciate your response on this issue. However, if you would notice the pid and ppid values in the same log is in the initial namespace, while the exit code is in a different namespace. Doesnt this make the audit log inconsistent? How is an application supposed to analyse the logs when a single log spans multiple namespaces ? How does an application parsing the logs get to know if its actually a different pid namespace other than relying on audit to provide consistent information in the log?

I agree it can mess up other system calls, but atleast in this case for clone, this issue should be fixed somehow. The audit logs having pid and ppids in different namespace and exit codes in different namespace is just making the log impossible to parse. Some of these connections(ssh etc) can be extremely shortlived and as such audit logs are the only way to track such events which is not possible without fixing this issue.

Shouldnt we use one namespace in the log and as such all pid, ppids and exit codes should be translated to that namespace for the log ?

@rgbriggs
Copy link
Member

rgbriggs commented Jan 9, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants