[feature] Allow tracking with bcc tools via openai function call. #8

Merged
merged 4 commits into from
Jul 8, 2023

try-agaaain
Member

Description

This PR adds a command-line option, -b (--bcc). When it is given, GPTtrace selects the appropriate tool from bcc-bpftools and sets its parameters to carry out the tracing task, for example:

$ ./GPTtrace.py -v -b "count kernel stack traces for submit_bio"
Run:  sudo stackcount-bpfcc submit_bio 
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
^C
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_bio_write_page'
  b'mpage_submit_page'
  b'mpage_process_page_bufs'
  b'mpage_prepare_extent_to_map'
  b'ext4_writepages'
  b'do_writepages'
  b'__writeback_single_inode'
  b'writeback_sb_inodes'
  b'__writeback_inodes_wb'
  b'wb_writeback'
  b'wb_workfn'
  b'process_one_work'
  b'worker_thread'
  b'kthread'
  b'ret_from_fork'
    2
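
For readers unfamiliar with OpenAI function calling: each tool is described to the model as a function definition whose parameters are given as a JSON schema, and the model answers with a function name plus JSON-encoded arguments. The sketch below shows roughly what such a definition could look like for stackcount-bpfcc. The descriptions and the parameter names `pid` and `pattern` are illustrative assumptions, not the definitions shipped in this PR.

```python
import json

# Hypothetical function definition for stackcount-bpfcc, in the JSON-schema
# shape used by OpenAI function calling. Field values are assumptions made
# for illustration and are not copied from this PR.
STACKCOUNT_FUNC = {
    "name": "stackcount-bpfcc",
    "description": "Count events and their stack traces.",
    "parameters": {
        "type": "object",
        "properties": {
            "pid": {"type": "integer", "description": "Trace this PID only."},
            "pattern": {"type": "string", "description": "Function to match."},
        },
        "required": ["pattern"],
    },
}

# Shape of the model's answer: the chosen function name and its arguments
# encoded as a JSON string, which the caller has to decode.
example_call = {"name": "stackcount-bpfcc", "arguments": '{"pattern": "submit_bio"}'}
args = json.loads(example_call["arguments"])
print(example_call["name"], args["pattern"])
```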

@try-agaaain
Member Author

try-agaaain commented Jul 5, 2023

Problems that need to be solved

Here are some issues that I will fix later:

  • bcc-bpftools has 100+ tools, but only about 40 can be used for now, because OpenAI limits the number of functions passed in one request to 64, and the number of tokens exceeds 4096 when all function definitions are passed.
  • The project now contains a lot of code and needs to be restructured to make it clearer.
  • Some commands take positional parameters. When calling these commands, the parameter name must not be placed in front of the positional value, e.g.:
    stackcount-bpfcc --pid 112 --pattern "submit" [wrong way]
    stackcount-bpfcc --pid 112 "submit" [right way]
    Here pattern is the positional parameter.
  • The documentation also needs to be updated.

@try-agaaain try-agaaain requested review from Officeyutong and yunwei37 and removed request for Officeyutong July 5, 2023 09:43
@try-agaaain
Member Author

For the first problem, the selection is done in two steps:

  • Step 1: Pass the descriptions of all 117 commands to the LLM and let it pick the 20 commands most likely to solve the user's problem. This query does not need function calling.
  • Step 2: Pass the 20 commands from step 1 as function definitions and let the LLM decide which command to use to solve the user's problem.
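
The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `ask_llm` is a hypothetical callable that sends a plain-text prompt to the model and returns its reply, and `descriptions` and `funcs` are hypothetical dictionaries keyed by tool name.

```python
from typing import Callable, Dict, List

def shortlist_prompt(descriptions: Dict[str, str], request: str, k: int = 20) -> str:
    """Step 1 prompt: plain text only, no function definitions attached."""
    lines = [f"- {name}: {desc}" for name, desc in sorted(descriptions.items())]
    return (
        f"User request: {request}\n"
        f"List the {k} most relevant tool names below, one per line.\n"
        + "\n".join(lines)
    )

def parse_shortlist(reply: str, known: Dict[str, str], k: int = 20) -> List[str]:
    """Keep only names that really exist, in reply order, capped at k."""
    picked: List[str] = []
    for line in reply.splitlines():
        name = line.strip().lstrip("-• ").strip()
        if name in known and name not in picked:
            picked.append(name)
    return picked[:k]

def select_functions(
    funcs: Dict[str, dict],
    descriptions: Dict[str, str],
    request: str,
    ask_llm: Callable[[str], str],  # hypothetical: prompt in, reply text out
    k: int = 20,
) -> List[dict]:
    """Return the function definitions to pass to the model in step 2."""
    reply = ask_llm(shortlist_prompt(descriptions, request, k))
    names = parse_shortlist(reply, descriptions, k)
    return [funcs[name] for name in names if name in funcs]
```

Filtering the reply against the known tool names guards against the model inventing a command that does not exist, and the cap keeps the second request under the function-count and token limits mentioned earlier.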

For the third problem, I created a dictionary of positional parameters that records which parameters of a given command are positional; see https://github.com/try-agaaain/GPTtrace/blob/8b7c3eecdfeec04f0eb3eedba2e3526465374023/bcc_tools.py#L90
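
As a rough illustration of the idea (not the contents of the linked bcc_tools.py), such a dictionary can drive how a command line is assembled from the arguments returned by the model. The table `POSITIONAL_ARGS` and the helper `build_command` below are hypothetical names introduced for this sketch.

```python
import shlex
from typing import Any, Dict, List

# Hypothetical table: for each command, the argument names that must be
# passed by position (in this order) instead of as "--name value".
POSITIONAL_ARGS: Dict[str, List[str]] = {
    "stackcount-bpfcc": ["pattern"],
}

def build_command(tool: str, args: Dict[str, Any]) -> str:
    """Turn function-call arguments into a shell command string."""
    positional = POSITIONAL_ARGS.get(tool, [])
    parts = ["sudo", tool]
    # Named options come first.
    for name, value in args.items():
        if name in positional:
            continue
        if value is True:
            parts.append(f"--{name}")  # boolean flag, emitted without a value
        elif value is not False and value is not None:
            parts += [f"--{name}", str(value)]
    # Positional values come last, without their names.
    for name in positional:
        if name in args:
            parts.append(str(args[name]))
    return " ".join(shlex.quote(part) for part in parts)

print(build_command("stackcount-bpfcc", {"pid": 112, "pattern": "submit"}))
# prints: sudo stackcount-bpfcc --pid 112 submit
```

Commands that are absent from the table simply have every argument emitted as a named option.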

@yunwei37
Member

yunwei37 commented Jul 6, 2023

Why not generate the function call dynamically at runtime? What are the advantages and disadvantages?

Also, maybe you can try combining this with agents in LangChain later?
https://docs.langchain.com/docs/components/agents/

@try-agaaain
Member Author

try-agaaain commented Jul 6, 2023

To generate a function call dynamically, I first need to know which command to generate it for. How should I determine that?

Dynamic generation also has drawbacks: function calls generated by the LLM may not always be accurate. If we predefine these function calls carefully, we can reduce the number of errors.

Using function calling through LangChain doesn't seem any more convenient, but I'm fine with switching to LangChain.

@try-agaaain
Member Author

Is it possible to merge this PR first? It already includes a lot of updates (and might be a bit messy). @yunwei37 @Officeyutong

I have added a -c option that lets users name the tool to use: GPTtrace -c opensnoop-bpfcc "help me trace the open syscall in pid 123" runs opensnoop-bpfcc. For example:

```console
$ ./GPTtrace.py -c memleak-bpfcc "Trace allocations and display each individual allocator function call"
 Run:  sudo memleak-bpfcc --trace 
Attaching to kernel allocators, Ctrl+C to quit.
(b'Relay(35)', 402, 6, b'd...1', 20299.252425, b'alloc exited, size = 4096, result = ffff8881009cc000')
(b'Relay(35)', 402, 6, b'd...1', 20299.252425, b'free entered, address = ffff8881009cc000, size = 4096')
(b'Relay(35)', 402, 6, b'd...1', 20299.252426, b'free entered, address = 588a6f, size = 4096')
(b'Relay(35)', 402, 6, b'd...1', 20299.252427, b'alloc entered, size = 4096')
(b'Relay(35)', 402, 6, b'd...1', 20299.252427, b'alloc exited, size = 4096, result = ffff8881009cc000')
(b'Relay(35)', 402, 6, b'd...1', 20299.252428, b'free entered, address = ffff8881009cc000, size = 4096')
(b'sudo', 6938, 10, b'd...1', 20299.252437, b'alloc entered, size = 2048')
(b'sudo', 6938, 10, b'd...1', 20299.252439, b'alloc exited, size = 2048, result = ffff88822e845800')
(b'node', 410, 18, b'd...1', 20299.252455, b'alloc entered, size = 256')
(b'node', 410, 18, b'd...1', 20299.252457, b'alloc exited, size = 256, result = ffff8882e9b66400')
(b'node', 410, 18, b'd...1', 20299.252458, b'alloc entered, size = 2048')
```

For bcc tools, it looks up the corresponding function call from funcs.json. For other tools not defined in funcs.json, the LLM dynamically generates the function call.
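
The lookup-with-fallback behaviour described above could be sketched like this. The layout of funcs.json is an assumption here (a JSON list of objects that each carry a "name" key), and `generate` stands in for whatever hypothetical routine asks the LLM to produce a definition for an unknown tool.

```python
import json
from typing import Callable, Dict

def load_funcs(path: str) -> Dict[str, dict]:
    """Load predefined definitions, assuming a JSON list of {"name": ...} objects."""
    with open(path, encoding="utf-8") as f:
        return {entry["name"]: entry for entry in json.load(f)}

def get_function_def(
    tool: str,
    predefined: Dict[str, dict],
    generate: Callable[[str], dict],  # hypothetical LLM-backed generator
) -> dict:
    """Prefer the predefined definition; generate one only when it is missing."""
    if tool in predefined:
        return predefined[tool]
    return generate(tool)
```

Checking the predefined table first keeps the hand-written definitions authoritative and limits LLM-generated definitions, which may be less accurate, to tools that have no entry.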

In the next PR, I will attempt to accomplish the tracing task using a method similar to autogpt. #10

@Officeyutong
Collaborator

Some suggestions:

@try-agaaain
Member Author

Some suggestions:

Great suggestion! In the last submission, I added documentation for each function. The descriptions may not be detailed enough yet, but they will be improved in the future.

@try-agaaain try-agaaain merged commit 606a7a3 into eunomia-bpf:main Jul 8, 2023
2 checks passed
3 participants