elf parsing should use buffered reads #907

Closed
tiehuis opened this issue Apr 10, 2018 · 3 comments · Fixed by #2482

@tiehuis
Member

tiehuis commented Apr 10, 2018

pub fn main() void {
    @panic("error");
}
zig build-exe example.zig
strace -c ./example
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.45    0.242440           2    116710           read
  0.38    0.000934           3       283           write
  0.08    0.000204           2        84           lseek
  0.05    0.000123          25         5           open
  0.02    0.000052          52         1           execve
  0.00    0.000012          12         1           fstat
  0.00    0.000008           2         4           close
  0.00    0.000005           3         2           rt_sigprocmask
  0.00    0.000004           4         1           tkill
  0.00    0.000003           3         1           ioctl
  0.00    0.000003           3         1           gettid
------ ----------- ----------- --------- --------- ----------------
100.00    0.243788                117093           total

We are using unbuffered file input for reads. When producing a stack trace, we need to parse the ELF file for debug information, and this is currently done one byte at a time, resulting in an enormous number of read syscalls (116,710 in the run above).
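
To make the cost concrete, here is a minimal sketch in C (not Zig, and not the standard library's code) that counts how many read(2) calls it takes to consume a file with a given request size. The function name `count_read_calls` is hypothetical and only serves the illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Read the whole file at `path` with read(2) requests of `chunk` bytes.
 * Returns the number of read(2) calls issued (including the final call
 * that reports end of file), or -1 on error. The number of bytes read
 * is stored in *total. */
long count_read_calls(const char *path, size_t chunk, long *total)
{
    char buf[4096];
    assert(total != NULL);
    assert(chunk > 0 && chunk <= sizeof buf);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long calls = 0;
    *total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, chunk);
        calls++;
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)
            break; /* end of file */
        *total += n;
    }
    close(fd);
    return calls;
}
```

Calling it with `chunk` set to 1 issues roughly one syscall per byte of the file, while a 4096-byte request size cuts the count by about three orders of magnitude for the same data.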

We should be able to use io.BufferedInStream; we just need to reset the input buffer whenever we perform a seek.
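
The buffer-reset-on-seek idea can be sketched in C as follows. This is an illustration of the technique only, not the API of io.BufferedInStream; the `buf_reader` type and the `br_*` function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffered reader over a file descriptor. */
struct buf_reader {
    int fd;
    size_t pos;            /* index of the next unread byte in buf */
    size_t len;            /* number of valid bytes in buf */
    unsigned long refills; /* number of read(2) calls issued so far */
    unsigned char buf[4096];
};

/* Open `path` read-only and start with an empty buffer. Returns 0 or -1. */
int br_open(struct buf_reader *r, const char *path)
{
    assert(r != NULL);
    r->fd = open(path, O_RDONLY);
    r->pos = r->len = 0;
    r->refills = 0;
    return r->fd < 0 ? -1 : 0;
}

/* Returns 1 and stores a byte in *out, 0 at end of file, -1 on error.
 * A syscall happens only when the buffer is exhausted. */
int br_read_byte(struct buf_reader *r, unsigned char *out)
{
    if (r->pos == r->len) {
        ssize_t n = read(r->fd, r->buf, sizeof r->buf);
        r->refills++;
        if (n <= 0)
            return (int)n;
        r->pos = 0;
        r->len = (size_t)n;
    }
    *out = r->buf[r->pos++];
    return 1;
}

/* Seek to an absolute offset and discard the buffered bytes, which
 * would otherwise be stale relative to the new file position. */
int br_seek(struct buf_reader *r, off_t offset)
{
    if (lseek(r->fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    r->pos = r->len = 0;
    return 0;
}
```

Discarding the buffer on every seek is the simplest correct policy. A refinement would keep the buffer when the target offset already lies inside it, which may matter given the 84 lseek calls in the trace above.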

I wonder whether it would be beneficial to default to buffering with io.FileInStream and provide an explicit opt-out, io.UnbufferedFileInStream, since this particular use case may be common enough.

@tiehuis tiehuis added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Apr 10, 2018
@andrewrk andrewrk added this to the 0.4.0 milestone Apr 12, 2018
@andrewrk
Member

The other thing I realized is that, on Linux, the kernel maps a binary's ELF file into memory before execution, so the binary does not need to read its own file from disk. We can use the Linux auxiliary vector to find our own ELF image in memory and read that directly.
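
As a rough illustration of the auxiliary vector approach, the C sketch below walks the program headers of the running executable through `getauxval(AT_PHDR)` without opening any file. It assumes Linux with glibc 2.16 or later (or another libc that provides `getauxval`), and the function name `count_load_segments` is hypothetical. Only what the loader actually mapped is reachable this way.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>     /* ElfW() */
#include <stddef.h>
#include <sys/auxv.h> /* getauxval() */

/* Count the PT_LOAD segments of the running executable by walking the
 * program headers that are already mapped into this process.
 * Returns -1 if the auxiliary vector lacks the needed entries. */
int count_load_segments(void)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    unsigned long phnum = getauxval(AT_PHNUM);
    if (phdr == NULL || phnum == 0)
        return -1;

    /* Sanity check: the entry size reported by the kernel should match
     * the native program header type. */
    assert(getauxval(AT_PHENT) == sizeof(ElfW(Phdr)));

    int loads = 0;
    for (unsigned long i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_LOAD)
            loads++;
    }
    return loads;
}
```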

@andrewrk
Member

I used the above idea when implementing macOS stack traces in #1365 and it worked.

@andrewrk
Member

However, it empirically appears that Linux does not map the .debug_info section & friends into memory automatically. So this technique will not in fact work for Linux. There is, however, the question of whether mmapping the executable is appropriate. I would argue that you can trust your own executable's bits, and I would further argue that in a stack trace scenario, the failure modes of mmap are acceptable.
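
A minimal C sketch of mmapping the executable, assuming Linux and that `/proc/self/exe` is available. The function name `map_self_exe` is hypothetical, and this is not necessarily how the standard library ended up doing it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the running executable read-only. On success, returns a pointer
 * to the mapping and stores its length in *size; returns NULL on
 * failure. The caller releases the mapping with munmap(). */
const unsigned char *map_self_exe(size_t *size)
{
    assert(size != NULL);

    int fd = open("/proc/self/exe", O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *size = (size_t)st.st_size;
    return (const unsigned char *)p;
}
```

With the whole file mapped, sections that are not loaded at run time, such as .debug_info, can be reached by offset from the returned pointer with no further read or lseek calls. The failure modes are those of open, fstat, and mmap, in which case a stack trace printer could fall back to printing raw addresses.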

@andrewrk andrewrk added the standard library This issue involves writing Zig code for the standard library. label Aug 25, 2018
@andrewrk andrewrk modified the milestones: 0.4.0, 0.5.0 Feb 7, 2019
@tiehuis tiehuis self-assigned this May 13, 2019
tiehuis added a commit that referenced this issue May 13, 2019