Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interactive bash command might be missed/lost under some circumstances #490

Closed
ruitianzhong opened this issue Feb 25, 2024 · 12 comments
Closed
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed

Comments

@ruitianzhong
Copy link
Contributor

ruitianzhong commented Feb 25, 2024

Describe the bug

Under some circumstances bash command would be missed, which would not occur when using bashreadline from BCC by contrast.

Catching all interactive bash shell commands in my opinion is critically important for the auditing.

To Reproduce

start ecapture under bash mode:

sudo ./ecapture bash

Example 1

exec echo hello >> output

ecapture will show nothing in this case.

Example 2

if echo hello ;
then 
ls
fi

ecapture will only show fi in this case.

By contrast, ecapture's counterpart bashreadline show correct results even when executing interactive bash expression which does not track returned value at all.

Analysis

This bug is due to the track of return value of the executed program.

In example 1, the built-in exec command would not fork a process to execute new program. Instead, it directly call execve in the process of bash itself, which cause execute_command() in Bash never return (verified through debugging). Thus ecapture can not get the return value and print out the command by hooking execute_command(). It will also cause memory leak because bpf_map_delete_elem will not be executed any time soon.

In example 2, although the first three lines have been recorded in the map by the order (override the previous one), they are not printed out because they have not been really executed and thus have no return value leading to the printed message. When fi is input finally, it has return value and is printed out as a result.

Fix option
In my opinion, it would be better to not track return value in bash mode. Actually not every line of interactive bash command has a return value and it might introduce overhead to accurately track every line of interactive bash command while catching the return value.

If needed, I can take a Pull Request to fix it.

@cfc4n cfc4n added bug Something isn't working good first issue Good for newcomers labels Feb 25, 2024
@cfc4n
Copy link
Member

cfc4n commented Feb 25, 2024

Example 1 is works in my Server. how to reproduce?

Example 2 run failed. Indeed, this is a bug.

@ruitianzhong
Copy link
Contributor Author

Screencast.from.02-25-2024.04.20.16.PM.webm

Here is the demo for Example 1.

Some contexts:
OS: Ubuntu 22.04 (installed on bare metal)
Bash: 5.1.16
ecapture: commit 2127596

@cfc4n
Copy link
Member

cfc4n commented Feb 28, 2024

I reproduced it too. The key to reproducibility is whether to execute the bash command. Can you pinpoint the reason and send a PR?

@ruitianzhong
Copy link
Contributor Author

Screencast.from.02-28-2024.08.59.42.PM.webm

Actually, in my computer without executing the bash command, the bug can still be reproduced. I suspect that it is related to the bash configuration.

So, I try to reproduced the similar result as yours in the video above after setting exec's alias to "".

To confirm it, run in your initial bash shell :

type exec

As for the fix for the missing command, in my opinion, more discussion is needed before really finding out the relatively weird behavior on your computer. Also the fix option I proposed (i.e., removing the support for returned value tracking) is a breaking change( removing a feature ), which may significantly contradicts your original design decision and the use case for eCapture.

So I want to ask for your advice. Would it better to remove the support for returned value tracing? Are there better option that I do not know without breaking the original feature

Finally, I confine the analysis the inherent defect for return value analysis on my initial comment:

  • command might never return from execute_command() which leads to the miss.
  • command might only return once from execute_command() for a multi-line code block which contains multiple command.

Best wishes!

@cfc4n
Copy link
Member

cfc4n commented Feb 28, 2024

Also the fix option I proposed (i.e., removing the support for returned value tracking) is a breaking change( removing a feature ), which may significantly contradicts your original design decision and the use case for eCapture.

Indeed, the original intention of eCapture's design was to track "input commands" without including return values. However, it does not mean that "return value tracking cannot be supported".

However, the current main goal is to determine the root cause of the problem.

@ruitianzhong
Copy link
Contributor Author

ruitianzhong commented Feb 29, 2024

In a word, the root cause is that execute_command() in bash might not return for every line of bash input (like example 1 and 2). However, eCapture just reads one line of command each time when hooking internal_readline_teardown() and expects this line command to return in execute_command() in order to print it out . If execute_command() do not return the next line, hooking function for internal_readline_teardown() will replace the previous one and the previous line is lost(never printed out).

@cfc4n
Copy link
Member

cfc4n commented Mar 2, 2024

So, what is your solution? Remove the detection of execute_command?

@sancppp, what do you think about this issue?

@cfc4n cfc4n added the help wanted Extra attention is needed label Mar 2, 2024
@ruitianzhong
Copy link
Contributor Author

ruitianzhong commented Mar 2, 2024

I come up with three potential solution with different consideration:

  • Remove the detection of execute_command with every interactive command recorded . However, in this way, we could not tracking returned value any more.
  • Stay the same with the risk of losing some commands when auditing (I don't know the tolerance for losing interactive commands in real auditing )
  • Send a event(including info like cmdline, pid ) to userspace when return from readline function and sending another event when returned from execute_command. When receiving event from execute_command, we just print out all commands stored before according to pid and print the final return value we capture with best effort. For simple command, it works without losing track of every command and returned value. For command containing something like if, it will lost track on some returned value in the code blocks. Commands like exec ls would be missed in this solutiion. If we could track the function related to exec internal command in Bash, exec ls command line still can be tracked and no command will be lost.

@cfc4n
Copy link
Member

cfc4n commented Mar 2, 2024

  1. Reported incorrect and unsuccessful executed events.
  2. Missed some events.
  3. More complex changes.

I agree with 3, but it may require extensive testing.

@sancppp
Copy link
Contributor

sancppp commented Mar 2, 2024

So, what is your solution? Remove the detection of execute_command?

@sancppp, what do you think about this issue?

I think execute_command() cannot be used as a hook position for eBPF.
It is incorrect to cache input content in a bpf map and wait for return value from the execute_command function.
Hack:

QQ20240302-214301.mp4

@ruitianzhong
Copy link
Contributor Author

ruitianzhong commented Mar 2, 2024

So, what is your solution? Remove the detection of execute_command?
@sancppp, what do you think about this issue?

I think execute_command() cannot be used as a hook position for eBPF. It is incorrect to cache input content in a bpf map and wait for return value from the execute_command function. Hack:

QQ20240302-214301.mp4

Another great example that shows the difficulties to correctly trace both every the returned value and command line.

@sancppp
Copy link
Contributor

sancppp commented Mar 2, 2024

in my opinion:

  • The number of built-in commands without returned value (such as exit, exec) is limited, perhaps they can be treated specially.
  • The uretprobe/readline function should check whether the key already exists, and if so, perform an append operation (in fact, this logic should be done in user space rather than in kernel space. The BPF program merely sends the collected data to user space.).

This was referenced Mar 4, 2024
ruitianzhong added a commit to ruitianzhong/ecapture that referenced this issue Mar 7, 2024
Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
cfc4n pushed a commit that referenced this issue Mar 8, 2024
* tentative commit to address bash problem #490

Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants