Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Easily panics on untrusted input #3

Open
ssokolow opened this issue Aug 11, 2022 · 0 comments
Open

Easily panics on untrusted input #3

ssokolow opened this issue Aug 11, 2022 · 0 comments

Comments

@ssokolow
Copy link

ssokolow commented Aug 11, 2022

I was trying to write the differential fuzzing harness I said I'd write in #2, but, once I got it to compile, I didn't even get a chance to make sure I'd implemented it correctly because I ran into panics from unexpected input.

Generally, when working with parsing code, you want to use .get(x..y) for your slicing rather than [x..y], so you can map them to your usual DecodeError if the structure of the data isn't as expected rather than hoping that the person using your module has wrapped calls to it in catch_unwind and isn't using panic=abort.

(In fact, just in general for my own code, I use #![warn(clippy:indexing_slicing)] so the compiler will complain if I forget to use .get())

I saw three different panics from three different runs (cargo-fuzz stops at the first failure), but the shortest of them was [84, 58, 58] (T:: in ASCII).

If you want to experiment with the draft fuzz harness without waiting for me to verify that the Python side is 100% correct, here's what I got so far. (I had to reimplement most of the hexbin function from the Python side because it only accepts file paths for the output and I don't want the fuzzer grinding away at my SSD's write cycles or depending on a path to a RAM drive.)

#![no_main]
use binhex4::decode::hexbin;
use libfuzzer_sys::fuzz_target;
use pyo3::prelude::*;

fuzz_target!(|data: &[u8]| {
    // Convert to Option<T> as a cheap hack to avoid having to manually map the Rust and Python
    // Err(T) states into things that assert_eq! will see as equal when they're equivalent.
    //
    // (i.e. Only care about success vs. failure and what content success returns)
    //
    // ...and work around awkwardness in `HQX` not having something like `impl Into` to easily
    // move the `Vec<u8>` out so assert_eq! can easily compare Option<Vec<u8>>s.
    let rust_result = match hexbin(data, true) {
        Ok(mut hqx) => Some(std::mem::replace(&mut hqx.vec, Vec::new())),
        Err(_) => None,
    };

    // TODO: Initialize as much as possible in a once_cell for better throughput
    let py_result = Python::with_gil(|py| {
        let mut result = Vec::new();
        let py_binhex = py.import("binhex").ok()?;
        let py_io = py.import("io").ok()?;

        #[allow(non_snake_case)]
        let HexBin: Py<PyAny> = py_binhex.getattr("HexBin").ok()?.into();
        #[allow(non_snake_case)]
        let BytesIO: Py<PyAny> = py_io.getattr("BytesIO").ok()?.into();

        let ifp = HexBin.call1(py, (BytesIO.call1(py, (data,)).ok()?,)).ok()?;
        while let Ok(block) = ifp.call_method1(py, "read", (128000,)) {
            let bytes: Vec<u8> = block.extract(py).ok()?;
            result.extend(bytes);
        }
        ifp.getattr(py, "close_data").ok()?.call0(py).ok()?;

        // TODO: Look up how `binhex4` outputs resource forks and match that
        while let Ok(block) = ifp.call_method1(py, "read_rsrc", (128000,)) {
            let bytes: Vec<u8> = block.extract(py).ok()?;
            result.extend(bytes);
        }
        ifp.getattr(py, "close").ok()?.call0(py).ok()?;
        Some::<Vec<u8>>(result)
    });

    assert_eq!(rust_result, py_result);
});

Just follow https://tiemoko.com/blog/diff-fuzz/ to get cargo-fuzz set up and init'd for the project, run cargo add pyo3 --features auto-initialize in the fuzz subdirectory it generates, and then copy the following into the fuzz/fuzz_targets/fuzz_target_1.rs it created.

For codebases with no unsafe code and not doing any C FFI, I'd suggest cargo +nightly fuzz run fuzz_target_1 -s none for better throughput.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant