elf parsing should use buffered reads #907

Open
tiehuis opened this Issue Apr 10, 2018 · 3 comments

tiehuis commented Apr 10, 2018

pub fn main() void {
    @panic("error");
}

zig build-exe example.zig
strace -c ./example

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.45    0.242440           2    116710           read
  0.38    0.000934           3       283           write
  0.08    0.000204           2        84           lseek
  0.05    0.000123          25         5           open
  0.02    0.000052          52         1           execve
  0.00    0.000012          12         1           fstat
  0.00    0.000008           2         4           close
  0.00    0.000005           3         2           rt_sigprocmask
  0.00    0.000004           4         1           tkill
  0.00    0.000003           3         1           ioctl
  0.00    0.000003           3         1           gettid
------ ----------- ----------- --------- --------- ----------------
100.00    0.243788                117093           total

We are using unbuffered file input for reads. When printing a stack trace we need to parse the ELF file for debug information, and this is done one byte at a time, resulting in an enormous number of read syscalls (116,710 in the run above).

We should be able to use io.BufferedInStream; we just need to reset the input buffer whenever we perform a seek.
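
To illustrate the buffer-reset idea in a language-neutral way, here is a minimal C sketch of a buffered byte reader that discards its buffer on seek. The `BufReader` type and the `br_*` functions are hypothetical names made up for this example, not Zig's `io.BufferedInStream` API; the 4096-byte buffer size is an arbitrary assumption, and `EINTR` handling is omitted for brevity.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffered reader; names are illustrative only. */
typedef struct {
    int fd;
    size_t pos;             /* index of the next unread byte in buf */
    size_t len;             /* number of valid bytes in buf */
    unsigned long refills;  /* read(2) calls issued so far */
    unsigned char buf[4096];
} BufReader;

/* Opens path read-only and starts with an empty buffer. Returns 0 or -1. */
static int br_open(BufReader *r, const char *path) {
    r->fd = open(path, O_RDONLY);
    r->pos = r->len = 0;
    r->refills = 0;
    return r->fd < 0 ? -1 : 0;
}

/* Returns 1 and stores one byte, 0 on EOF, -1 on error.
 * A syscall happens only when the buffer is exhausted. */
static int br_read_byte(BufReader *r, unsigned char *out) {
    if (r->pos == r->len) {
        ssize_t n = read(r->fd, r->buf, sizeof r->buf);
        r->refills++;
        if (n <= 0) return (int)n;  /* 0 = EOF, -1 = error */
        r->len = (size_t)n;
        r->pos = 0;
    }
    assert(r->pos < r->len);
    *out = r->buf[r->pos++];
    return 1;
}

/* Absolute seek: drop any buffered bytes so the next read refills
 * from the new file offset instead of returning stale data. */
static off_t br_seek(BufReader *r, off_t offset) {
    r->pos = r->len = 0;
    return lseek(r->fd, offset, SEEK_SET);
}
```

With this shape, reading N consecutive bytes costs roughly N / 4096 `read` calls instead of N, and each seek costs one `lseek` plus one refill.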

I wonder whether it would be beneficial to default to buffering with io.FileInStream and provide an explicit opt-out, io.UnbufferedFileInStream, since this particular use case may be common enough.

@tiehuis tiehuis added the enhancement label Apr 10, 2018

@andrewrk andrewrk added this to the 0.4.0 milestone Apr 12, 2018

andrewrk commented Aug 10, 2018

The other thing I realized is that, on Linux, when a binary reads its own ELF file, Linux actually maps it into memory before execution. We can use the Linux AUX vector to find our own ELF file in memory and read that directly.
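
As a rough illustration of the aux-vector approach, the C sketch below uses `getauxval` (available in glibc 2.16+ and musl) to locate the running program's own program headers in memory and count those of a given type. The helper name `count_own_phdrs` is hypothetical. Note that this only reaches what the kernel has mapped, which is the loadable segments described by the program headers.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>      /* ElfW() picks Elf32_* or Elf64_* for this target */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>  /* getauxval, AT_PHDR, AT_PHNUM, AT_PHENT */

/* Hypothetical helper: counts program headers of the given p_type in
 * the running executable, reading them straight from memory via the
 * auxiliary vector. No file I/O is involved. */
static size_t count_own_phdrs(uint32_t type) {
    const unsigned char *base =
        (const unsigned char *)(uintptr_t)getauxval(AT_PHDR);
    size_t phnum = (size_t)getauxval(AT_PHNUM);
    size_t phent = (size_t)getauxval(AT_PHENT);
    size_t count = 0;

    /* getauxval returns 0 for missing entries; bail out in that case. */
    if (base == NULL || phent < sizeof(ElfW(Phdr))) return 0;

    for (size_t i = 0; i < phnum; i++) {
        ElfW(Phdr) ph;
        memcpy(&ph, base + i * phent, sizeof ph);  /* alignment-safe copy */
        if (ph.p_type == type) count++;
    }
    return count;
}
```

For example, `count_own_phdrs(PT_LOAD)` reports how many loadable segments the kernel mapped for the current process.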

andrewrk commented Aug 25, 2018

I used the above idea when implementing macOS stack traces in #1365, and it worked.

andrewrk commented Aug 25, 2018

However, it empirically appears that Linux does not map the .debug_info section & friends into memory automatically. So this technique will not in fact work for Linux. There is, however, the question of whether mmapping the executable is appropriate. I would argue that you can trust your own executable's bits, and I would further argue that in a stack trace scenario, the failure modes of mmap are acceptable.
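
For reference, a minimal C sketch of the mmap alternative on Linux: map the executable through `/proc/self/exe` and look up a section by name with plain pointer arithmetic. The `SelfImage` type and the `map_self` and `find_section` helpers are hypothetical names for this example, and the sketch assumes a native-endian ELF whose section header table is present and naturally aligned. It performs a handful of syscalls in total (open, fstat, mmap, close), regardless of how much of the file is parsed afterwards.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <link.h>      /* ElfW() */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical read-only view of the running executable's file. */
typedef struct {
    const unsigned char *data;
    size_t size;
} SelfImage;

/* Maps /proc/self/exe read-only. Returns 0 on success, -1 on failure. */
static int map_self(SelfImage *img) {
    int fd = open("/proc/self/exe", O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size < (off_t)sizeof(ElfW(Ehdr))) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) return -1;
    img->data = p;
    img->size = (size_t)st.st_size;
    return 0;
}

/* Returns the section header named `name`, or NULL if it is absent or
 * the headers fail the bounds checks. */
static const ElfW(Shdr) *find_section(const SelfImage *img, const char *name) {
    const ElfW(Ehdr) *eh = (const ElfW(Ehdr) *)img->data;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0) return NULL;
    if (eh->e_shoff == 0 || eh->e_shentsize != sizeof(ElfW(Shdr))) return NULL;
    if (eh->e_shoff > img->size ||
        (size_t)eh->e_shnum * sizeof(ElfW(Shdr)) > img->size - eh->e_shoff)
        return NULL;
    if (eh->e_shstrndx >= eh->e_shnum) return NULL;

    const ElfW(Shdr) *sh = (const ElfW(Shdr) *)(img->data + eh->e_shoff);
    const ElfW(Shdr) *str = &sh[eh->e_shstrndx];  /* section name table */
    if (str->sh_offset > img->size || str->sh_size > img->size - str->sh_offset)
        return NULL;

    const char *names = (const char *)img->data + str->sh_offset;
    size_t want = strlen(name) + 1;  /* compare including the NUL */
    for (size_t i = 0; i < eh->e_shnum; i++) {
        if (sh[i].sh_name <= str->sh_size &&
            want <= str->sh_size - sh[i].sh_name &&
            memcmp(names + sh[i].sh_name, name, want) == 0)
            return &sh[i];
    }
    return NULL;
}
```

A caller would pass `".debug_info"` to `find_section` and, if the result is non-NULL, read the section at `img.data + sh_offset` for `sh_size` bytes. If `map_self` fails, the stack trace can simply fall back to printing raw addresses.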

@andrewrk andrewrk added the userland label Aug 25, 2018
