-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#1948 add libcall args sym printing: syscall args support #1975
Conversation
Added arguments printing of already known syscalls from drsyscall's tables for Windows and Linux.
I am planning to use the same syscall_info_t structure to describe other libcalls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to use the drsyscall library through public interfaces, rather than breaking abstractions and using its private, internal tables. It sounds like adding one simple interface would eliminate the need for all of the uses of private members here, at least at first glance.
drltrace/drltrace_frontend.cpp
Outdated
(DROPTION_SCOPE_CLIENT, "all_args", 2, "Number of library call arguments to print", | ||
"Number of library call arguments to print. Specify 0 to disable. Drltrace will print " | ||
"symbolic representation of arguments in case of known library call including all " | ||
"known arguments."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description is confusing: so if the actual arguments are known, they will all be printed and this option ignored, right? So then this should option should be named "num_unknown_args" or something, right? And the short description should have "unknown" in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the idea was simple: if user wants to print arguments, he would use -all_args and in this case drltrace will provide as much as possible information. If user doesn't want to print he can specify 0 and disable arguments printing at all (for example to increase performance).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest splitting into two options: one that controls the number of unknown args printed ("num_unknown _args" or sthg), and another that is either a max on args of any type ("max_args" or sthg) or maybe just a boolean controlling whether any arg printing is done (the max seems more useful).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that I understand. For example user specified -num_unknown_args 3 -max_args 7.
Should we print 3 arguments for unknown library calls and 7 arguments for known libcalls or 7 arguments for both types?
What should we have by default for -max_args?
Boolean controlling looks more reasonable for me. For example user can specify -num_unknown_args 5 -print_known_args. So we know that we should print everything for known system calls and maximum 5 arguments for unknown. By default we may set -num_unknown_args to be 0 and -print_known_args to be true ? (or may be false to demonstrate maximum performance by default).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"-num_unknown_args 3 -max_args 7" means never print more than 7, and furthermore don't print more than 3 when the types and # of args are not known.
Definitely there is a use case where no args should be printed. The question is whether some users would want to limit the # of args to some small number for cleaner output: see some but not all. On Linux there are never more than 6, but on Windows where there can be over 20 it may be a different story.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I got it, will implement it this way
drltrace/drltrace.c
Outdated
#elif MACOS | ||
for (i = 0; i < count_syscall_info_bsd; i++) | ||
hashtable_add_new_name_entry(syscall_info_bsd[i], &syscall_info_bsd[i]); | ||
#else /* LINUX */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Safer to have #elif defined(LINUX) and #else #error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
drltrace/drltrace.c
Outdated
hashtable_add_new_name_entry(syscall_gdi32_info[i], &syscall_gdi32_info[i]); | ||
for (i = 0; i < num_ntdll_syscalls(); i++) | ||
hashtable_add_new_name_entry(syscall_ntdll_info[i], &syscall_ntdll_info[i]); | ||
#elif MACOS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defined(MACOS)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
drltrace/drltrace.c
Outdated
print_symbolic_args(func, wrapcxt); | ||
|
||
/* XXX: we have to print a module id + offset here instead of an | ||
* absolute address and provide a dump a module list at the end of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"provide a dump a"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
drltrace/drltrace.c
Outdated
dr_fprintf(outf, ")"); | ||
} | ||
|
||
void print_symbolic_args(app_pc func, void *wrapcxt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style violation again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
drltrace/drltrace.c
Outdated
return (syscall_info_t *)hashtable_lookup(&addr2entry_table, func); | ||
} | ||
|
||
void print_args(app_pc func, void *wrapcxt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
drltrace/drltrace.c
Outdated
return true; /* keep going */ | ||
} | ||
|
||
syscall_info_t *lookup_entry(app_pc func) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style violation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
drltrace/drltrace.c
Outdated
return true; /* keep going */ | ||
} | ||
|
||
syscall_info_t *lookup_entry(app_pc func) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
drltrace/drltrace.c
Outdated
libcall_entry = lookup_entry(func); | ||
if (libcall_entry != NULL) { | ||
/* drsys_iterate_arg_types employs static enumeration. */ | ||
res = drsys_iterate_arg_types((drsys_syscall_t *)libcall_entry, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks the abstraction barrier of the drsyscall library: drsys_syscall_t is an opaque type and a user of the library should not have assumptions about what it looks like. See the suggestion above for using library interfaces instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, implemented a new function: drsys_iterate_arg_types_by_sysname
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO it's cleaner to separate things and have a lookup whose output can be used to the existing iterator, rather than having a multi-operation iterator.
BTW, if drsys_name_to_syscall() can be used statically, which should be true, then no new interface function is needed, right? Call drsys_name_to_syscall() and pass the resulting handle to drsys_iterate_arg_types().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense, what about hashtables initialization ? we have drsyscall_os_init but I need to provide a context for this function...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean: any hashtables inside drsyscall are invisible to the user here and are initialized when drsyscall_init() is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, drsys_init()/drsys_exit() will do everything. So it works perfect the only one problem is ASSERT at lines 252-257, it raises error when I am sending Zw* syscall. I can replace Zw* with Nt* but probably I am missing something ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ASSERT at lines 252-257 in drsyscall.c
drltrace/drltrace.c
Outdated
|
||
/* XXX: We need to add support of printing from secondary table (e.g. ioctl) (xref i#1549) */ | ||
#ifdef WINDOWS | ||
extern dr_os_version_info_t win_ver; /* already defined in drsyscall_windows.c */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these are exported by the drsyscall library, though. They're not part of the library interface! All uses here should be through interfaces.
It sounds like you could add a static drsys_lookup_syscall() routine that takes in a name and returns a drsys_syscall_t "handle", right? Then there's no need to break the library's abstractions in multiple ways here by using its internal, private data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, I've created three functions that user can use externally.
Added arguments printing of already known syscalls from drsyscall's tables for Windows and Linux. Added two new arguments for drltrace -num_unknown_args and max_args to limit maximum arguments to be printed for known and unknown libcalls and syscalls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much better. I have a few minor comments/suggestions.
drltrace/drltrace.c
Outdated
@@ -62,7 +62,8 @@ | |||
* + Add argument values and return values. The number and type of each | |||
* argument and return would likely come from the filter configuration | |||
* file, or from querying debug information. | |||
* Today we have simple type-blind printing via -all_args. | |||
* Today we have simple type-blind printing via -all_args and usage of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_unknown_args
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
drltrace/drltrace.c
Outdated
#define MAX_LIBCALL_NAME 255 | ||
|
||
/* XXX: The functions print_simple_value and print_arg were taken from drstrace. | ||
* It would be better to move them in drsyscall and import in drstrace and here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a lot of code to duplicate. Is there an issue # for the XXX?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
static void | ||
print_arg(drsys_arg_t *arg) | ||
{ | ||
dr_fprintf(outf, "\n arg %d: ", arg->ordinal); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This format is completely different from the unknown-arg format: the unknown is all one one line, separated by commas. This has a separate line per arg. Shouldn't they be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, do we want to print unknown args one per line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am worry about log size here, compact unknown args printing scheme saves a lot of space...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want compact, should the known args be one line (x, y, z, ...) then, closer to Linux ltrace and strace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of known I think we have pretty good scheme of printing. One line would be a bit complicated to read in case of known arguments while for unknown arguments it looks ok. I would prefer to leave current scheme for now...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like log size or screen space is not a concern, then.
Having different formats complicates post-processing the log: now to parse the file there are two different formats to match.
It also seems odd to have a syscall with a known name but unknown types (e.g., NtCreateEnclave) have its 9 arguments all on separate lines with unknown types, while a syscall with an unknown name is printed completely differently even though its arg types are also unknown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, ok let's do one arg per line for all schemes.
drltrace/drltrace.c
Outdated
if (name[0] == 'Z' && name[1] == 'w') | ||
dr_snprintf(syscall_name, BUFFER_SIZE_ELEMENTS(syscall_name), "Nt%s", name+2); | ||
else | ||
dr_snprintf(syscall_name, BUFFER_SIZE_ELEMENTS(syscall_name), "%s", name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could avoid this redundant copy by having a pointer that points either at name or syscall_name
drltrace/drltrace.c
Outdated
else | ||
dr_snprintf(syscall_name, BUFFER_SIZE_ELEMENTS(syscall_name), "%s", name); | ||
|
||
res = drsys_name_to_syscall(syscall_name, &syscall); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or should we change this routine to succeed with the Zw variant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want additional flag e.g. ignore_zw_nt_mismatch or sthg like that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought drsyscall had entries for both the Nt and Zw forms. Are you sure it doesn't? I thought that today either query would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see code in drsyscall_windows.c that is adding duplicate entries for Zw to the name2num_table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is correct both versions work. We have an assert at https://github.com/DynamoRIO/drmemory/blob/master/drsyscall/drsyscall.c#L251
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NtQueryPerformanceCounter ZwQueryPerformanceCounter
ASSERT FAILURE (thread 2760): C:\drmemory\drmemory\drsyscall\drsyscall.c:257: st
ri_eq(sysinfo->name, name) IF_WINDOWS(|| strcasestr(sysinfo->name, name) != NULL
) (name<->num mismatch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assert should be relaxed of course.
drltrace/drltrace.c
Outdated
|
||
/* XXX: We employ two schemes of arguments printing. drsyscall is used | ||
* to get a symbolic representation of arguments for known library calls. | ||
* For the rest of library calls we use type-blind printing and -unknown_args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
num_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
drltrace/drltrace.c
Outdated
*/ | ||
print_symbolic_args(name, wrapcxt, func); | ||
|
||
/* XXX: we have to print a module id + offset here instead of an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i#?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created
drltrace/drltrace.c
Outdated
drsys_syscall_t *syscall; | ||
char syscall_name[MAX_LIBCALL_NAME]; | ||
|
||
if (options.max_args > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest: "if (options.max_args == 0) return;" to make the rest of the function un-indented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
"unknown args printing."); | ||
|
||
static droption_t<uint> op_max_args | ||
(DROPTION_SCOPE_CLIENT, "num_max_args", 6, "Maximum number of arguments to print", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not ideal to duplicated these default values. The plan is to switch drltrace.c to use droption also and share these definitions directly, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm preparing a separate commit for that.
Added arguments printing of already known syscalls from drsyscall's tables for Windows and Linux. Added two new arguments for drltrace -num_unknown_args and max_args to limit maximum arguments to be printed for known and unknown libcalls and syscalls.
drsyscall/drsyscall.c
Outdated
#ifdef DEBUG | ||
ASSERT(stri_eq(sysinfo->name, name) | ||
/* ignore possible Nt/Zw mismatch */ | ||
if (((sysinfo->name[0] == 'N' && sysinfo->name[1] == 't') || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be Windows only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change in the next commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the next commit? It needs to update to master and thus re-run the checks anyway, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'm assuming by "next commit" you mean tacking onto a future pull request? Seems cleaner to be part of this one, not an unrelated pull request, right?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, yes, just need to pass through the local tests.
There's no test of the output yet? We should add a test soon. |
Yes, it's important. |
Added arguments printing of already known syscalls from drsyscall's tables for Windows and Linux. Added two new arguments for drltrace -num_unknown_args and max_args to limit maximum arguments to be printed for known and unknown libcalls and syscalls.
Added arguments printing of already known syscalls from drsyscall's
tables for Windows and Linux.