elf parsing should use buffered reads #907

tiehuis · 2018-04-10T05:27:05Z

pub fn main() void {
    @panic("error");
}

zig build-exe example.zig
strace -c ./example

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 99.45    0.242440           2    116710           read
  0.38    0.000934           3       283           write
  0.08    0.000204           2        84           lseek
  0.05    0.000123          25         5           open
  0.02    0.000052          52         1           execve
  0.00    0.000012          12         1           fstat
  0.00    0.000008           2         4           close
  0.00    0.000005           3         2           rt_sigprocmask
  0.00    0.000004           4         1           tkill
  0.00    0.000003           3         1           ioctl
  0.00    0.000003           3         1           gettid
------ ----------- ----------- --------- --------- ----------------
100.00    0.243788                117093           total

We are using unbuffered file input for reads. When producing a stack trace we need to parse the ELF file for debug information, and this is currently done one byte at a time, resulting in an enormous number of read syscalls (116,710 in the trace above).

We should be able to use io.BufferedInStream; we just need to reset the input buffer whenever we perform a seek.
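The buffering-plus-seek idea can be sketched outside of Zig. The following is a minimal C illustration, not the standard library's implementation: `BufReader`, `br_open`, `br_read_byte`, and `br_seek` are hypothetical names, and the 4096-byte buffer size is an arbitrary assumption. The point is only that a seek must discard whatever is buffered, and that byte-at-a-time consumers then cost roughly one `read` per buffer rather than one per byte.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffered reader over a file descriptor (illustrative names). */
typedef struct {
    int fd;
    unsigned char buf[4096]; /* buffer size is an arbitrary assumption */
    size_t pos;              /* index of the next unread byte in buf */
    size_t len;              /* number of valid bytes currently in buf */
    unsigned long syscalls;  /* read(2) calls issued, kept for illustration */
} BufReader;

/* Opens path read-only and initializes the reader. Returns 0, or -1 on error. */
static int br_open(BufReader *r, const char *path) {
    assert(r != NULL);
    r->fd = open(path, O_RDONLY);
    r->pos = 0;
    r->len = 0;
    r->syscalls = 0;
    return r->fd < 0 ? -1 : 0;
}

/* Returns the next byte (0..255), or -1 on end of file or error.
 * A read(2) is issued only when the buffer has been fully consumed. */
static int br_read_byte(BufReader *r) {
    if (r->pos == r->len) {
        ssize_t n = read(r->fd, r->buf, sizeof r->buf);
        r->syscalls++;
        if (n <= 0) return -1;
        r->pos = 0;
        r->len = (size_t)n;
    }
    return r->buf[r->pos++];
}

/* Seeks to an absolute file offset and resets the buffer, since its
 * contents no longer correspond to the new file position. */
static int br_seek(BufReader *r, off_t offset) {
    if (lseek(r->fd, offset, SEEK_SET) == (off_t)-1) return -1;
    r->pos = 0;
    r->len = 0;
    return 0;
}
```

A caller would `br_open` the ELF file, call `br_read_byte` in its parsing loop, and call `br_seek` when jumping between sections. A more careful version could skip the refill when the seek target is still inside the current buffer.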

I wonder whether it would be beneficial to default to buffering with io.FileInStream and provide an explicit opt-out, io.UnbufferedFileInStream, since this particular use case may be common enough?


andrewrk · 2018-08-10T20:44:48Z

The other thing I realized is that, on Linux, when a binary reads its own ELF file, Linux actually maps it into memory before execution. We can use the Linux AUX vector to find our own ELF file in memory and read that directly.
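As a rough illustration of the AUX vector approach, here is a C sketch that assumes glibc's `getauxval` and the `ElfW` macro from `<link.h>`. The function name `count_load_segments` is hypothetical. It walks the program headers that the kernel has already placed in memory, without opening the executable file. Program headers describe loadable segments, so this shows only what the loader mapped; it says nothing about sections that were not loaded.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/auxv.h>

/* Counts the PT_LOAD segments of the running executable by reading the
 * in-memory program headers advertised through the auxiliary vector.
 * Returns 0 if the kernel did not provide AT_PHDR. */
static size_t count_load_segments(void) {
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)(uintptr_t)getauxval(AT_PHDR);
    size_t phnum = (size_t)getauxval(AT_PHNUM);
    size_t count = 0;

    if (phdr == NULL) return 0;

    /* Each entry is expected to have the native program header size. */
    assert(getauxval(AT_PHENT) == sizeof(ElfW(Phdr)));

    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_LOAD) count++;
    }
    return count;
}
```

The same loop could record each segment's `p_vaddr` and `p_memsz` to learn which parts of the file are resident.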

andrewrk · 2018-08-25T06:23:36Z

I used the above idea when implementing macOS stack traces in #1365, and it worked.

andrewrk · 2018-08-25T07:03:05Z

However, it empirically appears that Linux does not map the .debug_info section & friends into memory automatically. So this technique will not in fact work for Linux. There is, however, the question of whether mmapping the executable is appropriate. I would argue that you can trust your own executable's bits, and I would further argue that in a stack trace scenario, the failure modes of mmap are acceptable.
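A minimal C sketch of mapping an executable read-only follows, under stated assumptions: `MappedFile`, `map_file_readonly`, and `unmap_file` are hypothetical names, and using `/proc/self/exe` as the path to the running binary is an assumption about how the executable is located on Linux. One mmap failure mode to keep in mind is that accessing pages of a file truncated after mapping can raise SIGBUS.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* A read-only view of a whole file (illustrative type). */
typedef struct {
    const unsigned char *data;
    size_t size;
} MappedFile;

/* Maps the file at path into memory read-only.
 * Returns 0 on success, -1 on any failure. */
static int map_file_readonly(const char *path, MappedFile *out) {
    assert(out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) return -1;

    out->data = (const unsigned char *)p;
    out->size = (size_t)st.st_size;
    return 0;
}

/* Releases a mapping created by map_file_readonly. */
static void unmap_file(MappedFile *m) {
    munmap((void *)m->data, m->size);
    m->data = NULL;
    m->size = 0;
}
```

With the file mapped, a parser can index into `data` directly, so looking up sections such as .debug_info needs no `read` or `lseek` calls at all; the kernel pages the bytes in on demand.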

Closes #907.

tiehuis added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Apr 10, 2018

andrewrk added this to the 0.4.0 milestone Apr 12, 2018

andrewrk added the standard library This issue involves writing Zig code for the standard library. label Aug 25, 2018

andrewrk modified the milestones: 0.4.0, 0.5.0 Feb 7, 2019

tiehuis mentioned this issue May 9, 2019

std.debug.scanAllFunctions is too slow; stack traces are too slow for large binaries #2458

Open

tiehuis self-assigned this May 13, 2019

tiehuis added a commit that referenced this issue May 13, 2019

Mmap debug info on linux

c4d1597

Closes #907.

tiehuis mentioned this issue May 13, 2019

Mmap debug info on linux #2482

Merged

tiehuis closed this as completed in #2482 May 14, 2019
