Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when parsing an argument value that is \r\n #305

Closed
Eijebong opened this issue Aug 26, 2017 · 3 comments
Closed

Crash when parsing an argument value that is \r\n #305

Eijebong opened this issue Aug 26, 2017 · 3 comments

Comments

@Eijebong
Copy link
Contributor

Hi,

I was trying to fuzz ammonia to see if there was anything that would make it crash and found this string <a a=\r\n that makes html5ever panic.

Here is the test example:

test!(fuzz, "<a a=\r\n", "");

The panic/backtrace:

thread '<unnamed>' panicked at 'assertion failed: c.is_some()', /home/eijebong/.cargo/registry/src/github.com-1ecc6299db9ec823/html5ever-0.19.0/src/tokenizer/mod.rs:555:8
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
             at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at /checkout/src/libstd/sys_common/backtrace.rs:71
   2: std::panicking::default_hook::{{closure}}
             at /checkout/src/libstd/sys_common/backtrace.rs:60
             at /checkout/src/libstd/panicking.rs:381
   3: std::panicking::default_hook
             at /checkout/src/libstd/panicking.rs:397
   4: std::panicking::rust_panic_with_hook
             at /checkout/src/libstd/panicking.rs:611
   5: std::panicking::begin_panic
             at /checkout/src/libstd/panicking.rs:572
   6: <html5ever::tokenizer::Tokenizer<Sink>>::discard_char
             at ./<panic macros>:3
   7: <html5ever::tokenizer::Tokenizer<Sink>>::step
             at /home/eijebong/.cargo/registry/src/github.com-1ecc6299db9ec823/html5ever-0.19.0/src/tokenizer/mod.rs:570
   8: <html5ever::tokenizer::Tokenizer<Sink>>::run
             at /home/eijebong/.cargo/registry/src/github.com-1ecc6299db9ec823/html5ever-0.19.0/src/tokenizer/mod.rs:362
   9: <html5ever::tokenizer::Tokenizer<Sink>>::feed
             at /home/eijebong/.cargo/registry/src/github.com-1ecc6299db9ec823/html5ever-0.19.0/src/tokenizer/mod.rs:220
  10: <html5ever::driver::Parser<Sink> as tendril::stream::TendrilSink<tendril::fmt::UTF8>>::process
             at /home/eijebong/.cargo/registry/src/github.com-1ecc6299db9ec823/html5ever-0.19.0/src/driver.rs:88
  11: <tendril::stream::Utf8LossyDecoder<Sink, A> as tendril::stream::TendrilSink<tendril::fmt::Bytes, A>>::process
             at /home/eijebong/.cargo/registry/src/github.com-1ecc6299db9ec823/tendril-0.4.0/src/stream.rs:179
  12: tendril::stream::TendrilSink::read_from
             at /home/eijebong/.cargo/registry/src/github.com-1ecc6299db9ec823/tendril-0.4.0/src/stream.rs:79
  13: rust_fuzzer_test_input
             at fuzz/fuzz_targets/fuzz_target_1.rs:17
  14: libfuzzer_sys::test_input_wrap::{{closure}}
             at /home/eijebong/.cargo/git/checkouts/libfuzzer-sys-e07fde05820d7bc6/67f7399/src/lib.rs:11
  15: std::panicking::try::do_call
             at /checkout/src/libstd/panicking.rs:480
  16: <unknown>
             at /checkout/src/libpanic_abort/lib.rs:40
@Eijebong
Copy link
Contributor Author

Looked at it a little bit. It seems that https://github.com/servo/html5ever/blob/master/html5ever/src/tokenizer/mod.rs#L252 is the problem since the string that's failing has a '\r\n' at the end of the document.

So it hits get_preprocessed_char with '\r', self.ignore_lf is set to true and '\n' is processed.
It then re hits get_preprocessed_char but self.ignore_lf is true, so it fetches the next character which doesn't exist. It then returns None, so this assertion fails https://github.com/servo/html5ever/blob/master/html5ever/src/tokenizer/mod.rs#L555

It seems to me that removing the assert would be an easy way of fixing that but I'm not sure if it's the right one.

@jdm
Copy link
Member

jdm commented Oct 24, 2017

@milindjain0 You could fix this bug. I agree that removing the assertion is the right solution.

@krk
Copy link

krk commented Apr 23, 2018

This one fails with 0.22.2 too.

thread 'main' panicked at 'assertion failed: c.is_some()', C:\Users\user\.cargo\registry\src\github.com-1ecc6299db9ec823\html5ever-0.22.2\src\tokenizer\mod.rs:554:9
stack backtrace:
   0: std::sys::windows::backtrace::unwind_backtrace
             at C:\projects\rust\src\libstd\sys\windows\backtrace\mod.rs:65
   1: std::sys_common::backtrace::_print
             at C:\projects\rust\src\libstd\sys_common\backtrace.rs:71
   2: std::sys_common::backtrace::print
             at C:\projects\rust\src\libstd\sys_common\backtrace.rs:59
   3: std::panicking::default_hook::{{closure}}
             at C:\projects\rust\src\libstd\panicking.rs:207
   4: std::panicking::default_hook
             at C:\projects\rust\src\libstd\panicking.rs:223
   5: std::panicking::rust_panic_with_hook
             at C:\projects\rust\src\libstd\panicking.rs:402
   6: std::panicking::begin_panic<str*>
             at C:\projects\rust\src\libstd\panicking.rs:365
   7: html5ever::tokenizer::Tokenizer<html5ever::tree_builder::TreeBuilder<alloc::rc::Rc<markup5ever::rcdom::Node>, markup5ever::rcdom::RcDom>>::discard_char<html5ever::tree_builder::TreeBuilder<alloc::rc::Rc<markup5ever::rcdom::Node>, markup5ever::rcdom::RcDom>>
             at \<panic macros>:3
   8: html5ever::tokenizer::Tokenizer<html5ever::tree_builder::TreeBuilder<alloc::rc::Rc<markup5ever::rcdom::Node>, markup5ever::rcdom::RcDom>>::step<html5ever::tree_builder::TreeBuilder<alloc::rc::Rc<markup5ever::rcdom::Node>, markup5ever::rcdom::RcDom>>
             at C:\Users\user\.cargo\registry\src\github.com-1ecc6299db9ec823\html5ever-0.22.2\src\tokenizer\mod.rs:569
   9: html5ever::tokenizer::Tokenizer<html5ever::tree_builder::TreeBuilder<alloc::rc::Rc<markup5ever::rcdom::Node>, markup5ever::rcdom::RcDom>>::run<html5ever::tree_builder::TreeBuilder<alloc::rc::Rc<markup5ever::rcdom::Node>, markup5ever::rcdom::RcDom>>
             at C:\Users\user\.cargo\registry\src\github.com-1ecc6299db9ec823\html5ever-0.22.2\src\tokenizer\mod.rs:361
  10: html5ever::tokenizer::Tokenizer<html5ever::tree_builder::TreeBuilder<alloc::rc::Rc<markup5ever::rcdom::Node>, markup5ever::rcdom::RcDom>>::feed<html5ever::tree_builder::TreeBuilder<alloc::rc::Rc<markup5ever::rcdom::Node>, markup5ever::rcdom::RcDom>>
             at C:\Users\user\.cargo\registry\src\github.com-1ecc6299db9ec823\html5ever-0.22.2\src\tokenizer\mod.rs:219
  11: html5ever::driver::{{impl}}::process<markup5ever::rcdom::RcDom>
             at C:\Users\user\.cargo\registry\src\github.com-1ecc6299db9ec823\html5ever-0.22.2\src\driver.rs:88
  12: tendril::stream::{{impl}}::process<html5ever::driver::Parser<markup5ever::rcdom::RcDom>,tendril::tendril::NonAtomic>
             at C:\Users\user\.cargo\registry\src\github.com-1ecc6299db9ec823\tendril-0.4.0\src\stream.rs:179
  13: tendril::stream::TendrilSink::read_from<tendril::stream::Utf8LossyDecoder<html5ever::driver::Parser<markup5ever::rcdom::RcDom>, tendril::tendril::NonAtomic>,tendril::fmt::Bytes,tendril::tendril::NonAtomic,slice<u8>*>
             at C:\Users\user\.cargo\registry\src\github.com-1ecc6299db9ec823\tendril-0.4.0\src\stream.rs:79

mozfreddyb added a commit to mozfreddyb/html5ever that referenced this issue May 24, 2018
mozfreddyb added a commit to mozfreddyb/html5ever that referenced this issue Aug 5, 2018
mozfreddyb added a commit to mozfreddyb/html5ever that referenced this issue Aug 5, 2018
mozfreddyb added a commit to mozfreddyb/html5ever that referenced this issue Aug 5, 2018
bors-servo pushed a commit that referenced this issue Aug 20, 2018
Remove unrequired assertion for invalid markup (fixes #305)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants