Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jumping on uninitalized value #421

Closed
Kixiron opened this issue Jun 14, 2020 · 13 comments
Closed

Jumping on uninitalized value #421

Kixiron opened this issue Jun 14, 2020 · 13 comments

Comments

@Kixiron
Copy link

Kixiron commented Jun 14, 2020

While running valgrind on my program, I found this

==28112== Conditional jump or move depends on uninitialised value(s)
==28112==    at 0xB28252: html5ever::tree_builder::data::doctype_error_and_quirks (data.rs:157)
==28112==    by 0x682617: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token (mod.rs:472)
==28112==    by 0x6D0AA8: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17ha27fb7b70a53cc47E.llvm.10445680363282595522 (mod.rs:237)
==28112==    by 0x6D926A: html5ever::tokenizer::Tokenizer<Sink>::step (mod.rs:0)
==28112==    by 0x6D390A: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17h29926122ee9878e2E.llvm.10445680363282595522 (mod.rs:369)
==28112==    by 0x62196F: feed<html5ever::tree_builder::TreeBuilder<kuchiki::tree::NodeRef, kuchiki::parser::Sink>> (mod.rs:224)
==28112==    by 0x62196F: <html5ever::driver::Parser<Sink> as tendril::stream::TendrilSink<tendril::fmt::UTF8>>::process (driver.rs:109)
==28112==    by 0x5E9504: process<html5ever::driver::Parser<kuchiki::parser::Sink>,tendril::tendril::NonAtomic> (stream.rs:180)
==28112==    by 0x5E9504: one<tendril::stream::Utf8LossyDecoder<html5ever::driver::Parser<kuchiki::parser::Sink>, tendril::tendril::NonAtomic>,tendril::fmt::Bytes,tendril::tendril::NonAtomic,&[u8]> (stream.rs:48)
==28112==    by 0x5E9504: cratesfyi::utils::html::extract_head_and_body (html.rs:28)
==28112==    by 0x76F641: cratesfyi::web::rustdoc::rustdoc_html_server_handler (rustdoc.rs:312)
==28112==    by 0x751A1B: call<fn(&mut iron::request::Request) -> core::result::Result<iron::response::Response, iron::error::IronError>,(&mut iron::request::Request)> (function.rs:72)
==28112==    by 0x751A1B: <F as iron::middleware::Handler>::handle (mod.rs:312)
==28112==    by 0xA3ED50: <alloc::boxed::Box<dyn iron::middleware::Handler> as iron::middleware::Handler>::handle (mod.rs:318)
==28112==    by 0x7009C6: <cratesfyi::web::metrics::RequestRecorder as iron::middleware::Handler>::handle (metrics.rs:185)
==28112==    by 0xA3ED50: <alloc::boxed::Box<dyn iron::middleware::Handler> as iron::middleware::Handler>::handle (mod.rs:318)
==28112==  Uninitialised value was created by a stack allocation
==28112==    at 0xB27BC0: html5ever::tree_builder::data::doctype_error_and_quirks (data.rs:91)

gdb then pointed out the source of the problem, which is this snippet

html5ever::tree_builder::data::doctype_error_and_quirks (doctype=<optimized out>, iframe_srcdoc=false) at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/html5ever-0.25.1/src/tree_builder/data.rs:157
157             (_, Some(ref s)) if QUIRKY_SYSTEM_MATCHES.contains(s) => Quirks,
@jdm
Copy link
Member

jdm commented Jun 14, 2020

@Kixiron Do you have a public codebase that reproduces the error that I can refer to?

@jdm
Copy link
Member

jdm commented Jun 14, 2020

I've tried running https://github.com/jyn514/html5ever-example/blob/master/src/struct.CaptureMatches.html through a kuchiki example program under valgrind, and I don't see the error you reported. A minimal program that reproduces the error would be most useful for me.

@Kixiron
Copy link
Author

Kixiron commented Jun 14, 2020

The code's all on this branch, I made a minimal reproduction with this being all the parse-related code. For reproduction steps I simply ran cargo build --release --bin html5ever and then valgrind target/release/html5ever

@Kixiron
Copy link
Author

Kixiron commented Jun 14, 2020

Me running that binary gives this output

==9592== Conditional jump or move depends on uninitialised value(s)
==9592==    at 0x1B4CF2: html5ever::tree_builder::data::doctype_error_and_quirks (data.rs:157)
==9592==    by 0x168607: <html5ever::tree_builder::TreeBuilder<Handle,Sink> as html5ever::tokenizer::interface::TokenSink>::process_token (mod.rs:472)
==9592==    by 0x187348: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$13process_token17ha27fb7b70a53cc47E.llvm.10445680363282595522 (mod.rs:237)
==9592==    by 0x18FB0A: html5ever::tokenizer::Tokenizer<Sink>::step (mod.rs:0)
==9592==    by 0x18A1AA: _ZN9html5ever9tokenizer21Tokenizer$LT$Sink$GT$3run17h29926122ee9878e2E.llvm.10445680363282595522 (mod.rs:369)
==9592==    by 0x15D78F: feed<html5ever::tree_builder::TreeBuilder<kuchiki::tree::NodeRef, kuchiki::parser::Sink>> (mod.rs:224)
==9592==    by 0x15D78F: <html5ever::driver::Parser<Sink> as tendril::stream::TendrilSink<tendril::fmt::UTF8>>::process (driver.rs:109)
==9592==    by 0x158024: process<html5ever::driver::Parser<kuchiki::parser::Sink>,tendril::tendril::NonAtomic> (stream.rs:180)
==9592==    by 0x158024: one<tendril::stream::Utf8LossyDecoder<html5ever::driver::Parser<kuchiki::parser::Sink>, tendril::tendril::NonAtomic>,tendril::fmt::Bytes,tendril::tendril::NonAtomic,&[u8]> (stream.rs:48)
==9592==    by 0x158024: cratesfyi::utils::html::extract_head_and_body (html.rs:28)
==9592==    by 0x1554F1: html5ever::main (html5ever.rs:4)
==9592==    by 0x154242: std::rt::lang_start::{{closure}} (rt.rs:67)
==9592==    by 0x1E42C7: {{closure}} (rt.rs:52)
==9592==    by 0x1E42C7: do_call<closure-0,i32> (panicking.rs:331)
==9592==    by 0x1E42C7: try<i32,closure-0> (panicking.rs:274)
==9592==    by 0x1E42C7: catch_unwind<closure-0,i32> (panic.rs:394)
==9592==    by 0x1E42C7: std::rt::lang_start_internal (rt.rs:51)
==9592==    by 0x155687: main (in /mnt/g/Users/Chase/Code/Rust/docs.rs/target/release/html5ever)

@jdm
Copy link
Member

jdm commented Jun 14, 2020

Oh, interesting, I can reproduce the same problem in the kuchiki example when running with --release, but not with a debug build.

@Kixiron
Copy link
Author

Kixiron commented Jun 14, 2020

Same on my end, debug builds have no issues

@froydnj
Copy link

froydnj commented Jun 14, 2020

These sorts of errors are typically because LLVM has made valid optimizations but valgrind doesn’t/can’t understand that, so valgrind flags a potential error.

@jdm
Copy link
Member

jdm commented Jun 14, 2020

Also interesting - when I add a println above this line for opt_string_as_slice(&system) the valgrind error disappears. My suspicion that this is a compiler bug grows.

@jdm
Copy link
Member

jdm commented Jun 14, 2020

Rewriting the match as a series of equivalent if let and if statements does not change the valgrind output.

@jdm
Copy link
Member

jdm commented Jun 14, 2020

The following diff makes the error disappear:

diff --git a/html5ever/src/tree_builder/data.rs b/html5ever/src/tree_builder/data.rs
index 9d51a71..5e84200 100644
--- a/html5ever/src/tree_builder/data.rs
+++ b/html5ever/src/tree_builder/data.rs
@@ -147,14 +147,14 @@ pub fn doctype_error_and_quirks(doctype: &Doctype, iframe_srcdoc: bool) -> (bool
     let public = opt_to_ascii_lower(public);
     let system = opt_to_ascii_lower(system);
 
-    let quirk = match (opt_string_as_slice(&public), opt_string_as_slice(&system)) {
+    let quirk = match (opt_string_as_slice(&public), &system) {
         _ if doctype.force_quirks => Quirks,
         _ if name != Some("html") => Quirks,
 
         _ if iframe_srcdoc => NoQuirks,
 
         (Some(ref p), _) if QUIRKY_PUBLIC_MATCHES.contains(p) => Quirks,
-        (_, Some(ref s)) if QUIRKY_SYSTEM_MATCHES.contains(s) => Quirks,
+        (_, Some(ref s)) if QUIRKY_SYSTEM_MATCHES.contains(&&**s) => Quirks,
 
         (Some(p), _) if contains_pfx(QUIRKY_PUBLIC_PREFIXES, p) => Quirks,
         (Some(p), _) if contains_pfx(LIMITED_QUIRKY_PUBLIC_PREFIXES, p) => LimitedQuirks,

@jdm
Copy link
Member

jdm commented Jun 14, 2020

This is a minimal reproduction of the same problem. I'll try filing it on rustc and see if anything comes of it:

fn main() {
    let mut args = std::env::args();
    let s1 = args.next();
    let s2 = args.next();
    let a_matches = &["a"];
    let b_matches = &["b"];
    let a = s1.as_ref().map(|s| &s[..]);
    let b = s2.as_ref().map(|s| &s[..]);
    match (a, b) {
        (Some(ref a), _) if a_matches.contains(a) => println!("a"),
        (_, Some(ref b)) if b_matches.contains(b) => println!("b"),
        _ => println!("nothing"),
    }
}

@jdm
Copy link
Member

jdm commented Jun 14, 2020

Filed rust-lang/rust#73344.

@jdm
Copy link
Member

jdm commented Jun 16, 2020

Given rust-lang/rust#73344 (comment), I suspect the answer here is to update valgrind and leave the html5ever code unchanged.

@Kixiron Kixiron closed this as completed Jun 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants