I'm using r2pipe to extract callgraph info from all the binaries in a given folder. For each binary I first open it, then run the "aaa" command, and then extract the callgraph in r2 commands format with the "agC*" command. There is no specific issue per se: r2pipe works as intended, but it takes quite a lot of time to run through all the binaries.
I've checked the examples folder to see how to use r2pipe in batch, but the code there is somewhat simplified.
I wonder what would be your suggestions on how to improve the code runtime.
For instance, do I really need to quit r2 after each file?
How to reproduce?
Here is my code:
import hashlib
import os
from multiprocessing import Pool

import r2pipe

# binaries_dir is set elsewhere in the script (not shown in the original post)
binaries_list = os.listdir(binaries_dir)
batchsize = 1000  # execute files in batches of 1000
total_count = len(binaries_list)
hash_db = set()  # assumed: hashes seen so far (not defined in the original snippet)

def parseglobalcallgraph(filename):
    filepath = os.path.join(binaries_dir, filename)
    r2 = r2pipe.open(filepath, ["-e io.cache=true"])
    r2.cmd('aaa')
    gcg = r2.cmd("agC*")  # extract global call graph in r2 commands format
    r2.quit()
    hash_value = hashlib.md5(gcg.encode()).hexdigest()
    return {'hash': hash_value, 'filename': filename}

for i in range(0, len(binaries_list), batchsize):
    batch = binaries_list[i:i+batchsize]
    with Pool(processes=10) as pool:
        results = pool.imap(parseglobalcallgraph, batch)
        pool.close()
        for res in results:
            if res['hash'] not in hash_db:
                hash_db.add(res['hash'])
                print(res['hash'])
            else:
                continue
Expected behavior
I'd expect it to be much faster but seems like I'm missing something.
r2pipe is slow, partly because of Python and partly because of the way it reads data from the pipe. You can use the native r2pipe by prefixing the file path with ccall://, so that it uses dlopen(r_core) and makes direct C API calls. That will make the script at least 10 times faster.
You can help by improving the r2pipe module and profiling that issue. Other languages don't have this issue.
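As a rough illustration of the ccall:// suggestion, the sketch below prefixes the path before handing it to r2pipe.open. The helper names native_uri and callgraph_hash are hypothetical, and the sketch assumes the installed r2pipe build supports the ccall:// prefix and can locate the r_core library. The "-e io.cache=true" flag from the original script is left out because it is unclear whether flags are honoured in native mode.

```python
import hashlib

NATIVE_PREFIX = "ccall://"  # prefix named in the comment above


def native_uri(filepath: str) -> str:
    """Return the path prefixed with ccall:// (hypothetical helper)."""
    if filepath.startswith(NATIVE_PREFIX):
        return filepath
    return NATIVE_PREFIX + filepath


def callgraph_hash(filepath: str) -> str:
    """Analyse one binary through the native backend and hash its call graph."""
    # Imported lazily so native_uri() can be used without r2pipe installed.
    import r2pipe

    r2 = r2pipe.open(native_uri(filepath))
    try:
        r2.cmd("aaa")
        gcg = r2.cmd("agC*")  # global call graph in r2 commands format
    finally:
        r2.quit()
    return hashlib.md5(gcg.encode()).hexdigest()
```

In the original script, callgraph_hash(os.path.join(binaries_dir, filename)) would stand in for the open/aaa/agC*/quit sequence inside parseglobalcallgraph. Whether the speed-up matches the figure quoted above would need to be measured.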
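One possible starting point for the profiling mentioned above is to time each stage (open, aaa, agC*, quit) separately, to see whether the time goes into analysis or into pipe I/O. This is a minimal sketch: StageTimer and profile_binary are hypothetical names, and the opener argument is expected to be something like r2pipe.open.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)

    def run(self, name, fn, *args, **kwargs):
        """Call fn and add its elapsed time to the total for `name`."""
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            self.totals[name] += time.perf_counter() - start


def profile_binary(filepath, timer, opener):
    """Run the open/aaa/agC*/quit sequence, timing each stage."""
    r2 = timer.run("open", opener, filepath)
    timer.run("aaa", r2.cmd, "aaa")
    gcg = timer.run("agC*", r2.cmd, "agC*")
    timer.run("quit", r2.quit)
    return gcg
```

Calling profile_binary(path, timer, r2pipe.open) over a handful of binaries and then printing timer.totals should show which stage dominates. If "aaa" accounts for most of the time, switching the transport alone is unlikely to help much.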