add vdso to the blocklist in README.md and lib.rs #111

Merged
merged 1 commit into from
Mar 21, 2022

Conversation

YangKeao
Member

Signed-off-by: YangKeao <yangkeao@chunibyo.icu>
Contributor

@sticnarf sticnarf left a comment

LGTM

@YangKeao YangKeao merged commit e6b4886 into tikv:master Mar 21, 2022
@mstange

mstange commented Apr 21, 2022

In what way is the vdso unwind information wrong on Ubuntu 18.04? Do you have an example of an address which unwinds incorrectly?

@breezewish
Member

@mornyx Do you have a minimal reproducible example for #111 (comment)?

@mornyx
Contributor

mornyx commented Apr 22, 2022

@mornyx Do you have a minimal reproducible example for #111 (comment)?

It's hard to write a demo that reproduces the problem reliably, but we can use gimli-rs to parse the CFI of the vdso. gimli-rs reports that the data in .eh_frame_hdr or .eh_frame contains errors.

Use the example below:

use fallible_iterator::FallibleIterator;
use gimli::{BaseAddresses, CieOrFde, EhFrame, EhFrameHdr, NativeEndian, Pointer, UnwindSection};
use std::slice;

const PF_X: u32 = 1;

#[derive(Default, Debug)]
struct SectionInfo {
    base: u64,
    text: u64,
    text_len: u64,
    eh_frame_hdr: u64,
    eh_frame_hdr_len: u64,
    max_addr: u64,
}

fn main() {
    let mut info = SectionInfo::default();
    unsafe {
        libc::dl_iterate_phdr(Some(callback), &mut info as *mut _ as *mut libc::c_void);
    }
    if info.eh_frame_hdr == 0 {
        panic!("not found");
    }

    let address = BaseAddresses::default()
        .set_text(info.text)
        .set_eh_frame_hdr(info.eh_frame_hdr);
    let eh_frame_hdr_data = unsafe { slice::from_raw_parts(info.eh_frame_hdr as _, info.eh_frame_hdr_len as _) };
    let eh_frame_hdr = EhFrameHdr::new(eh_frame_hdr_data, NativeEndian)
        .parse(&address, 8)
        .unwrap();
    let eh_frame_ptr = match eh_frame_hdr.eh_frame_ptr() {
        Pointer::Direct(v) => v,
        Pointer::Indirect(v) => unsafe { *(v as *const u64) },
    };

    let address = address.set_eh_frame(eh_frame_ptr);
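    // The slice length is the number of bytes from eh_frame_ptr up to the
    // highest PT_LOAD end address, not the absolute end address itself.
    let eh_frame_data = unsafe { slice::from_raw_parts(eh_frame_ptr as _, (info.max_addr - eh_frame_ptr) as _) };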
    let eh_frame = EhFrame::new(eh_frame_data, NativeEndian);
    for entry in eh_frame.entries(&address).iterator() {
        let entry = entry.unwrap();
        match entry {
            CieOrFde::Cie(_) => {
                println!("cie");
            }
            CieOrFde::Fde(_) => {
                println!("fde");
            }
        }
    }
}

extern "C" fn callback(info: *mut libc::dl_phdr_info, _size: libc::size_t, data: *mut libc::c_void) -> libc::c_int {
    unsafe {
        let data = data as *mut SectionInfo;
        match std::ffi::CStr::from_ptr((*info).dlpi_name).to_str() {
            Ok(name) => {
                if !name.contains("vdso") {
                    return 0;
                }
            }
            Err(_) => return 0,
        }
        (*data).base = (*info).dlpi_addr;
        let hdrs = slice::from_raw_parts((*info).dlpi_phdr, (*info).dlpi_phnum as usize);
        for hdr in hdrs {
            match hdr.p_type {
                libc::PT_LOAD => {
                    if hdr.p_flags & PF_X != 0 {
                        (*data).text = (*info).dlpi_addr + hdr.p_vaddr;
                        (*data).text_len = hdr.p_memsz;
                    }
                    let max_addr = (*info).dlpi_addr + hdr.p_vaddr + hdr.p_filesz;
                    if (*data).max_addr < max_addr {
                        (*data).max_addr = max_addr;
                    }
                }
                libc::PT_GNU_EH_FRAME => {
                    (*data).eh_frame_hdr = (*info).dlpi_addr + hdr.p_vaddr;
                    (*data).eh_frame_hdr_len = hdr.p_memsz;
                }
                _ => {}
            }
        }
        0
    }
}

We will get:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: UnexpectedEof(ReaderOffsetId(281473204903936))', src/main.rs:32:10

If we replace if !name.contains("vdso") with if !name.contains("libc"), or with any other library's name, the program successfully prints the CIEs and FDEs.

@mstange

mstange commented Apr 22, 2022

Perfect, thank you!

@mstange

mstange commented Apr 22, 2022

For future reference, I've archived the vdso64.so from Ubuntu 18.04.6 LTS here: https://storage.googleapis.com/profiler-get-symbols-fixtures/ubuntu-18.04-lts-vdso/vdso64.so
(I got it by downloading and unpacking the iso, and then finding it at /lib/modules/5.4.0-84-generic/vdso/vdso64.so.)

@mstange

mstange commented Apr 22, 2022

Hmm, this file doesn't reproduce the problem. I will try the code you gave me on an actual Ubuntu 18.04 installation. Here is the code I tried.

@mstange

mstange commented Apr 23, 2022

Ah, I think my mistake was using Ubuntu 18.04.6. On that system, the code you gave me does not panic. Can you remember the exact Ubuntu version you were seeing this on? The /lib/modules/4.15.0-20-generic/vdso/vdso64.so from this Ubuntu 18.04 iso seems fine too.

@mornyx
Contributor

mornyx commented Apr 23, 2022

Can you remember the exact Ubuntu version you were seeing this on?

I tested it on Ubuntu 20.04 on AWS and on Ubuntu 18.04/20.04 in Docker for Mac (arm64), and it panics on all of them. I wonder whether the problem could be related to dl_iterate_phdr.

CentOS 7 in Docker for Mac (arm64) also panics, with one detail: when calling dl_iterate_phdr on CentOS, the dlpi_name of the vdso is an empty string, so it needs to be distinguished from the executable itself, which also reports an empty name.
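
One possible way to make that distinction, sketched here under assumptions rather than taken from pprof-rs: on Linux the kernel publishes the address of the vdso in the auxiliary vector as AT_SYSINFO_EHDR, so an object with an empty dlpi_name can be matched by address instead of by name. The helper name `is_vdso` and its parameters are hypothetical. The caller is assumed to pass `load_start` as dlpi_addr plus the p_vaddr of the object's first PT_LOAD segment, and `vdso_base` as the result of getauxval(AT_SYSINFO_EHDR), or None when that value is 0.

```rust
/// Hypothetical helper: decide whether an object reported by
/// dl_iterate_phdr is the vdso.
///
/// * `name`       - the object's dlpi_name
/// * `load_start` - dlpi_addr + p_vaddr of the first PT_LOAD segment
/// * `vdso_base`  - getauxval(AT_SYSINFO_EHDR), or None if it returned 0
fn is_vdso(name: &str, load_start: u64, vdso_base: Option<u64>) -> bool {
    // Address match: works even when dlpi_name is empty, which is also
    // the name reported for the main executable.
    if vdso_base == Some(load_start) {
        return true;
    }
    // Name match: same substring test as the repro above, used as a
    // fallback when the auxiliary vector value is unavailable.
    name.contains("vdso")
}
```

With this shape, an empty-named object is only treated as the vdso when its load address matches the auxiliary vector entry, so the executable is not skipped by mistake.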

@breezewish
Member

breezewish commented Apr 23, 2022

I hit the crash on Ubuntu 18.04 (on AWS, x86_64) by continuously profiling TiKV with pprof (without the vdso skip) for some hours (ref). However, running dwarfdump on the vdso does not report errors. Skipping the vdso does resolve the crash.

@mstange

mstange commented Apr 23, 2022

I was able to reproduce the panic in line 32 with the Ubuntu 20.04 aarch64 vdso (archived here)! Thanks for the additional information.
The reason it panics is that this vdso does not have an eh_frame_hdr section; it only has an eh_frame section. It still has a PT_GNU_EH_FRAME ELF program header, though, with all values set to zero. So eh_frame_hdr_len is 0, we set eh_frame_hdr_data to an empty slice, and parsing that empty slice fails with UnexpectedEof.
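
A reader of the program headers could guard against this case before building the slice. The following is a minimal sketch, not code from gimli or pprof-rs: the helper name `eh_frame_hdr_range` and its `Option` return type are hypothetical, and it simply treats a PT_GNU_EH_FRAME header with a zero size as if the header were absent.

```rust
/// Hypothetical helper: compute the in-memory byte range [start, end) of
/// .eh_frame_hdr from a PT_GNU_EH_FRAME program header, or return None
/// when the header is present but describes no data.
///
/// `load_bias` corresponds to dlpi_addr; `p_vaddr` and `p_memsz` are the
/// fields of the program header.
fn eh_frame_hdr_range(load_bias: u64, p_vaddr: u64, p_memsz: u64) -> Option<(u64, u64)> {
    // An all-zero header carries no .eh_frame_hdr data at all.
    if p_memsz == 0 {
        return None;
    }
    // dlpi_addr is a load bias, so the sum is taken modulo 2^64.
    let start = load_bias.wrapping_add(p_vaddr);
    // The range itself must not wrap around the address space.
    let end = start.checked_add(p_memsz)?;
    Some((start, end))
}
```

If the callback in the repro used a check like this and left eh_frame_hdr at 0 for an empty header, main would stop at the "not found" branch instead of panicking inside parse.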

I am not sure why libunwind segfaulted on the x86_64 vdso. So far I have not seen evidence of bad DWARF in it.
