fuzzing harfbuzz #139

Open
kcc opened this Issue Oct 9, 2015 · 35 comments

Projects

None yet

2 participants

@kcc
Collaborator
kcc commented Oct 9, 2015

This in an umbrella issue for setting up regular fuzzing for harfbuzz and fixing the bugs that we find with fuzzing.

The starting point is the target function below used with libFuzzer.

#include "src/hb.h"
#include "src/hb-ot.h"
#include "Fuzzer/FuzzerInterface.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  const char text[] = "ABCDEXYZ123@_%&)*$!";

  hb_blob_t *blob = hb_blob_create((const char *)data, size,
                                   HB_MEMORY_MODE_READONLY, NULL, NULL);
  hb_face_t *face = hb_face_create(blob, 0);
  hb_font_t *font = hb_font_create(face);
  hb_ot_font_set_funcs(font);
  hb_font_set_scale(font, 12, 12);

  hb_buffer_t *buffer = hb_buffer_create();
  hb_buffer_add_utf8(buffer, text, -1, 0, -1);
  hb_buffer_guess_segment_properties(buffer);

  hb_shape(font, buffer, NULL, 0);

  hb_buffer_destroy(buffer);
  hb_font_destroy(font);
  hb_face_destroy(face);
  hb_blob_destroy(blob);
  return 0;
}

Eventually we'll need to submit this function to harfbuzz repo and extend it to cover more code.

Currently, this is my workflow to build the fuzzer:

  1. Get fresh llvm and build libFuzzer.
(
cd harfbuzz
make distclean
SAN=-fsanitize=address
COV=-fsanitize-coverage=edge,8bit-counters,trace-cmp

CXX="clang++ $SAN $COV" CC="clang $SAN $COV" CCLD="clang++ $SAN $COV" ../harfbuzz/configure --enable-static --disable-shared
make -j
)
clang++ -std=c++11 harfbuzz_fuzzer.cc -fsanitize=address -fsanitize-coverage=edge -I harfbuzz -I. ./harfbuzz/src/.libs/libharfbuzz.a -lglib-2.0 Fuzzer*.o -o harfbuzz_fuzzer
@kcc
Collaborator
kcc commented Oct 9, 2015

Two recent findings:
ee9b0b6
f396fbb

@kcc
Collaborator
kcc commented Oct 9, 2015

Next catch.
Reproducer:

echo AAEAAAADgCoAKvwNUEdEKkYAKeAAdAA9AAAAM0dQT1MA1UARAAAAAAoDAAMAAQAEAAgAAgADAAEABAABAA== | base64 --decode > reproducer

==12311==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60600000effd at pc 0x000000619d1b bp 0x7ffccad4b690 sp 0x7ffccad4b688
READ of size 1 at 0x60600000effd thread T0
    #0 0x619d1a in OT::BEInt<unsigned short, 2>::operator unsigned short() const src/./hb-open-type-private.hh:584:13
    #1 0x619d1a in OT::IntType<unsigned short, 2u>::operator unsigned short() const src/./hb-open-type-private.hh:632
    #2 0x619d1a in OT::ValueFormat::get_len() const src/./hb-ot-layout-gpos-table.hh:93
    #3 0x619d1a in OT::PairPosFormat1::sanitize(OT::hb_sanitize_context_t*) const src/./hb-ot-layout-gpos-table.hh:708
    #4 0x617363 in bool OT::hb_sanitize_context_t::dispatch<OT::PairPosFormat1>(OT::PairPosFormat1 const&) src/./hb-open-type-private.hh:205:52
    #5 0x617363 in OT::hb_sanitize_context_t::return_t OT::PairPos::dispatch<OT::hb_sanitize_context_t>(OT::hb_sanitize_context_t*) const src/./hb-ot-layout-gpos-table.hh:848
    #6 0x617363 in OT::hb_sanitize_context_t::return_t OT::PosLookupSubTable::dispatch<OT::hb_sanitize_context_t>(OT::hb_sanitize_context_t*, unsigned int) const src/./hb-ot-layout-gpos-table.hh:1410
    #7 0x6163d5 in OT::hb_sanitize_context_t::return_t OT::Lookup::dispatch<OT::PosLookupSubTable, OT::hb_sanitize_context_t>(OT::hb_sanitize_context_t*) const src/./hb-ot-layout-common-private.hh:614:40
    #8 0x6163d5 in OT::hb_sanitize_context_t::return_t OT::PosLookup::dispatch<OT::hb_sanitize_context_t>(OT::hb_sanitize_context_t*) const src/./hb-ot-layout-gpos-table.hh:1476
    #9 0x6163d5 in OT::PosLookup::sanitize(OT::hb_sanitize_context_t*) const src/./hb-ot-layout-gpos-table.hh:1482
    #10 0x615909 in OT::OffsetTo<OT::PosLookup, OT::IntType<unsigned short, 2u> >::sanitize(OT::hb_sanitize_context_t*, void const*) const src/./hb-open-type-private.hh:792:5
    #11 0x615909 in OT::ArrayOf<OT::OffsetTo<OT::PosLookup, OT::IntType<unsigned short, 2u> >, OT::IntType<unsigned short, 2u> >::sanitize(OT::hb_sanitize_context_t*, void const*) const src/./hb-open-type-private.hh:893
    #12 0x5cd859 in OT::OffsetListOf<OT::PosLookup>::sanitize(OT::hb_sanitize_context_t*) const src/./hb-open-type-private.hh:950:5
    #13 0x5cd859 in OT::OffsetTo<OT::OffsetListOf<OT::PosLookup>, OT::IntType<unsigned short, 2u> >::sanitize(OT::hb_sanitize_context_t*, void const*) const src/./hb-open-type-private.hh:792
    #14 0x5cd859 in OT::GPOS::sanitize(OT::hb_sanitize_context_t*) const src/./hb-ot-layout-gpos-table.hh:1507
    #15 0x5cd859 in OT::Sanitizer<OT::GPOS>::sanitize(hb_blob_t*) src/./hb-open-type-private.hh:337
    #16 0x5a7598 in _hb_ot_layout_create(hb_face_t*) src/hb-ot-layout.cc:60:23
    #17 0x50f4c6 in hb_ot_shaper_face_data_ensure(hb_face_t*) src/./hb-shaper-list.hh:43:1
    #18 0x50f4c6 in hb_shape_plan_plan(hb_shape_plan_t*, hb_feature_t const*, unsigned int, char const* const*) src/./hb-shaper-list.hh:43
    #19 0x50f4c6 in hb_shape_plan_create src/hb-shape-plan.cc:149
    #20 0x512be5 in hb_shape_plan_create_cached src/hb-shape-plan.cc:458:33
    #21 0x50e02e in hb_shape_full src/hb-shape.cc:374:33
    #22 0x50e02e in hb_shape src/hb-shape.cc:405
    #23 0x4d3936 in LLVMFuzzerTestOneInput 

0x60600000effd is located 0 bytes to the right of 61-byte region [0x60600000efc0,0x60600000effd)
allocated by thread T0 here:
    #0 0x4a92db in __interceptor_malloc 
    #1 0x4d786b in _try_writable(hb_blob_t*) src/hb-blob.cc:465:23
    #2 0x5a7598 in _hb_ot_layout_create(hb_face_t*) src/hb-ot-layout.cc:60:23
    #3 0x50f4c6 in hb_ot_shaper_face_data_ensure(hb_face_t*) src/./hb-shaper-list.hh:43:1
    #4 0x50f4c6 in hb_shape_plan_plan(hb_shape_plan_t*, hb_feature_t const*, unsigned int, char const* const*) src/./hb-shaper-list.hh:43
    #5 0x50f4c6 in hb_shape_plan_create src/hb-shape-plan.cc:149
@behdad behdad added a commit that referenced this issue Oct 12, 2015
@behdad [fuzz] Add fuzzing script from kcc@ c1a5dc4
@behdad behdad closed this in f966649 Oct 13, 2015
@behdad
Owner
behdad commented Oct 13, 2015

Should be fixed. Please reopen with next issue :).

@kcc
Collaborator
kcc commented Oct 13, 2015

Gooood! This has unblocked further fuzzing.
Next item that is a minor headache for fuzzing, but is also a bug worth fixing:

==1899==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 152 byte(s) in 1 object(s) allocated from:
    #0 0x4b79fc in calloc /home/kcc/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:56
    #1 0x4e9bf3 in _ZL16hb_object_createI9hb_blob_tEPT_v src/./hb-object-private.hh:129:24
    #2 0x4e9bf3 in hb_blob_create src/hb-blob.cc:108
    #3 0x4e9bf3 in hb_blob_create_sub_blob src/hb-blob.cc:164
    #4 0x50acf4 in _hb_face_for_data_reference_table(hb_face_t*, unsigned int, void*) src/hb-face.cc:146:21
    #5 0x52f135 in hb_face_t::reference_table(unsigned int) const src/./hb-face-private.hh:72:12
    #6 0x52f135 in hb_ot_face_glyf_accelerator_t::init(hb_face_t*) src/hb-ot-font.cc:121
    #7 0x5293c4 in _hb_ot_font_create(hb_face_t*) src/hb-ot-font.cc:253:3
    #8 0x5293c4 in hb_ot_font_set_funcs src/hb-ot-font.cc:425
    #9 0x4e4ce4 in LLVMFuzzerTestOneInput /home/kcc/harfbuzz/test/fuzzing/hb-fuzzer.cc:12:3
    #10 0x4e5633 in main /home/kcc/harfbuzz/test/fuzzing/hb-fuzzer.cc:44:5
    #11 0x7fc341b06ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #12 0x41a4ce in _start (/home/kcc/harfbuzz_run+0x41a4ce)
Indirect leak of 152 byte(s) in 1 object(s) allocated from:
    #0 0x4b79fc in calloc /home/kcc/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:56
    #1 0x4e8bf5 in _ZL16hb_object_createI9hb_blob_tEPT_v src/./hb-object-private.hh:129:24
    #2 0x4e8bf5 in hb_blob_create src/hb-blob.cc:108
    #3 0x4e4cbb in LLVMFuzzerTestOneInput /home/kcc/harfbuzz/test/fuzzing/hb-fuzzer.cc:8:21
    #4 0x4e5633 in main /home/kcc/harfbuzz/test/fuzzing/hb-fuzzer.cc:44:5
    #5 0x7fc341b06ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #6 0x41a4ce in _start (/home/kcc/harfbuzz_run+0x41a4ce)

Reproducer in base64:

AAEAAAAOAIAAAwBgR1NVQuIT4Q4AAAgYAAAARE9TLzJxolP2AAAB0AAAAFZjbWFwAAwwXwAAAigA
AAA0Y3YAAQAAAAMCAgAqRUZRDfzQU0cpA+AAAD0AAAA0R1BPUwAVSAEAAAAAAwoAAQAEAAV0IA/A
EAAAAAWUAAACAGZwZ20BUpwYAAACXAAAALNnbHlmR9rfuAAAAOwAAABYaGVhZNv2AAAABAABAAMA
AgBoxARfNAAAAWwAAAA2aGhlYQejA0EAAAGsAAAAJGhtdHgGyAAAAq0AAAGkAAAACGxvY2EAFgAs
AAABZAAAAAhtYXhwGDUSawAAAUQAAAAgbmFtZQOoDBEAAAeUAAAARHBvc3QdaKP9AAAH2AAAAD1w
cmVwDyU+pQAAAxAAAAKCAAECyAA= 
@behdad behdad reopened this Oct 13, 2015
@behdad
Owner
behdad commented Oct 13, 2015

You should be able to reopen issues now.

@behdad
Owner
behdad commented Oct 13, 2015

Leak fixed.

@behdad behdad added a commit that referenced this issue Oct 13, 2015
@behdad [ot-font] Fix leak ed13e2c
@kcc
Collaborator
kcc commented Oct 13, 2015

Now, the bane of fuzzing: non-linear algorithms.
I am attaching several inputs that consume multiple seconds each.
In case these are different problems, please check all after fixing one.

For slow-unit-03b06616c9de07c85802e6a9a0652354266af8fd (takes 5 minutes to finish)
the profile looks like this:

 37.81%  harfbuzz_run  harfbuzz_run       [.] OT::apply_lookup(OT::hb_apply_context_t*, unsigned int, unsigned int*, unsigned int, OT::LookupRecord const*, unsigned int)
 21.42%  harfbuzz_run  harfbuzz_run       [.] hb_buffer_t::move_to(unsigned int)
 11.99%  harfbuzz_run  harfbuzz_run       [.] OT::SubstLookup::apply_recurse_func(OT::hb_apply_context_t*, unsigned int)

ARTIFACTS.tgz.pdf

(hm... github does not allow to upload .tgz file. The .pdf file attached is actually a tgz file. Weird)

@behdad behdad added a commit that referenced this issue Oct 14, 2015
@behdad Reduce max nesting level from 8 to 6
We probably should implement better system to catch cyclic lookups.
But for now, this speeds up worst case behavior with broken fonts
considerably without compromising legitimate usecases.

#139 (comment)
613e630
@behdad
Owner
behdad commented Oct 14, 2015

I pushed out a change that makes these edge cases significantly faster (over 10x). We still can do better, but this should give you a boost for now.

@kcc
Collaborator
kcc commented Oct 15, 2015

Did some tricks out of my sleeve...

==304==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61f00000fa98 at pc 0x0000005188a4 bp 0x7fffc8c70cf0 sp 0x7fffc8c70ce8
READ of size 1 at 0x61f00000fa98 thread T0
    #0 0x5188a3 in OT::BEInt<unsigned short, 2>::operator unsigned short() const src/./hb-open-type-private.hh:583:13
    #1 0x5188a3 in OT::IntType<unsigned short, 2u>::operator unsigned short() const src/./hb-open-type-private.hh:632
    #2 0x5188a3 in hb_ot_face_metrics_accelerator_t::get_advance(unsigned int) const src/hb-ot-font.cc:92
    #3 0x5188a3 in hb_ot_get_glyph_h_advance(hb_font_t*, void*, unsigned int, void*) src/hb-ot-font.cc:290
    #4 0x52c9cb in hb_font_t::get_glyph_h_advance(unsigned int) src/./hb-font-private.hh:164:12
    #5 0x52c9cb in hb_font_t::get_glyph_advance_for_direction(unsigned int, hb_direction_t, int*, int*) src/./hb-font-private.hh:257
    #6 0x52c9cb in hb_ot_position_default(hb_ot_shape_context_t*) src/hb-ot-shape.cc:652
    #7 0x52c9cb in hb_ot_position(hb_ot_shape_context_t*) src/hb-ot-shape.cc:752
    #8 0x52c9cb in hb_ot_shape_internal(hb_ot_shape_context_t*) src/hb-ot-shape.cc:796
    #9 0x52c9cb in _hb_ot_shape src/hb-ot-shape.cc:816
    #10 0x51130b in hb_shape_plan_execute src/./hb-shaper-list.hh:43:1
    #11 0x50e325 in hb_shape_full src/hb-shape.cc:375:19
    #12 0x50e325 in hb_shape src/hb-shape.cc:405
    #13 0x4d3946 in LLVMFuzzerTestOneInput

0x61f00000fa98 is located 127 bytes to the right of 1049-byte region [0x61f00000f600,0x61f00000fa19)
allocated by thread T0 here:
    #0 0x4d19eb in operator new(unsigned long)
    #1 0x7fd000411668 in __gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) 
    #2 0x7fd000411668 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) 
    #3 0x4d5474 in char* std::string::_S_construct_aux<std::istreambuf_iterator<char, std::char_traits<char> > >(std::istreambuf_iterator<char, std::char_traits<char> >, std::
    #4 0x4d4fe4 in char* std::string::_S_construct<std::istreambuf_iterator<char, std::char_traits<char> > >(std::istreambuf_iterator<char, std::char_traits<char> >, std::istr
    #5 0x4d4b03 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::istreambuf_iterator<char, std::char_traits<char> > >(std::istreamb
    #6 0x4d3c6f in FileToString(std::string const&) 

Note the "is located 127 bytes to the right of 1049-byte region" bit.
you will need ASAN_OPTIONS=redzone=128 to get reliable report.

Repro (base64)

AAEAAAALAIAAAwAwR1NVQrKltSEAACNoAAAAfk9TLzJpynqEAAABOAAAAGBjbWFwAGUDrAAAAogA
AACIZ2x5ZsL9NwYAAAOMAAAZMmhlYWQw2sRrAAAAvAAAADZoaGVhOOY4LQAAAPQAAAAkaG10eKLe
DeEAAAOYAAAA8GxvY2HhQ+fQAAADEAAAAHxtYXhwAF8DbwAAARgAAAAgbmFtZUVYv/YAABzAAAAF
EHBvc3TZ1eHyAAAh0AAAAZcAAQAAAAVMzIO9wBJfDzz1Ah0IAAAAAADMFoLdAAAAANHWrBb/nv4Z
MhwFlgAAAAgAAAAAAAAAAAABAAAHJ/4IAAA6mP+e/1EyHgABAAAAAAAAAAAAAAAAAAAAOwABAAAA
PQBdAAMCzQAVAAEAAABBAAAAAAAAAAAACQADAAMP5gGQAAUAAAUzBZkAAAEeBTMFmQAAA9cAZgIS
AQUCAAUDAAAAAAAAAAAAAQAAAAAAAAAAAAAAAFBmRWQAQAAgAHoHJ/4IAAAHJwH4AAAAAQAAAAAD
bwVEAEAAFAAHBAAAAAIAAAADuABQA7gAtgO4AGoDuABaA7gAOQO4AG8DuABaA7gAaAO4AF4DuABm
A6cASgPxABADbABMBAwAUAOTAEwCegAtBAAAQgRNABwCKwA1Ai3/ngQYAB0CHAAlBlEALQRWAC0E
CABUBCYACAQGAEgC+QAtAx4AYgKHADMEPwAtA/kADAX5AAwD6wAhBB4AGQNkAEwOEABODhAATg4Q
AE4OEABODhAATg4QAE4OEABODhAATg4QAE46mABOOpgATjqYAE46mABOOpgATjqYAE46mABOOpgA
TjqYAE46mAAAOpgAABOIAAAAAAAAAAAABAAAAAMAAAAkAAAACgAAAFQAAwABAAAAJAADAAoAAABU
AAQAMAAAAAgACAACAAAAIAA5AHr//wAAACAAMABh////4f/S/6sAAQAAAAAAAAAAAAwAAAAAADQA
AAAAAAAAAwAAACAAAAAgAAAAAQAAADAAAAA5AAAAOwIAAABhAAAAegAAAAwAAAAAAAAAOwBrALEB
BQFNAYwBxgHzAj4CeALLAxUDRAOfA9YEHASVBPwFOgWBBfQGIwanBwMHMQePB+QILQh5CLQJCglT
CckKQgqbCtkK6Qr5CwkLGQspCzkLSQtZC2kLhQuRC50LxQvRC90MAQwNDBkMNQxVDG0MhQyZAAIA
UP/sA2AE4QARACIAAAEiDgIVEBcxPgI3NhEQJyYDIicmAjUQEjMyFxYRFAIGBgHbJ0VEKdMSKTgz
ECN3Jz16VlRf66CWZItOfIQEmjJ2/LL98AEMI1A8fwEmAX9lI/tSVlMBIKUBIwFkhbj+yar++Q==
@behdad behdad added a commit that referenced this issue Oct 15, 2015
@behdad [ot-font] Fix hmtx wrong table length check
Discovered by libFuzzer.  Ouch!

#139 (comment)
63ef0b4
@kcc
Collaborator
kcc commented Oct 15, 2015

one more case of a very slow run (23 seconds)
slow-unit-b1e4f88fa1687e01e20b5708ad2f229631c1e460.pdf Not really a pdf file...

@kcc
Collaborator
kcc commented Oct 16, 2015

The public bot is up and running (similar to the one for FreeType).
It is very inefficient right now due to slow inputs, some taking over 200 seconds (with 4K size limit).

Steps once the slowness is fixed:

  • give it some time to saturate
  • look at the coverage dump, see what's not covered
  • extend the test corpus with inputs that will improve coverage
  • extend the target function itself (e.g. to deal with unicode text)
@kcc
Collaborator
kcc commented Oct 17, 2015

The attached input takes over 400 seconds to process, all time is spent in OT::apply_lookup
slow-unit-b599e7814a2a9343b4aa8bedd4f275ca9d2e6fd2.pdf

@behdad
Owner
behdad commented Oct 17, 2015

The attached input takes over 400 seconds to process, all time is spent in OT::apply_lookup
slow-unit-b599e7814a2a9343b4aa8bedd4f275ca9d2e6fd2.pdf

Thanks. Will take a look. The problem is, the OpenType layout model is very expressive. You can easily get exponential expansion from it. That said, by handling circular recursions we might get most of those handled already...

@jsonn jsonn pushed a commit to jsonn/pkgsrc that referenced this issue Oct 21, 2015
wiz Update harfbuzz to 1.0.6:
Overview of changes leading to 1.0.6
Thursday, October 15, 2015
====================================

- Reduce max nesting level in OT lookups from 8 to 6.
  Should not affect any real font as far as I know.
- Fix memory access issue in ot-font.
- Revert default load-flags of fonts created using hb_ft_font_create()
  back to FT_LOAD_DEFAULT|FT_LOAD_NO_HINTING.  This was changed in
  last release (1.0.5), but caused major issues, so revert.
  behdad/harfbuzz#143


Overview of changes leading to 1.0.5
Tuesday, October 13, 2015
====================================

- Fix multiple memory access bugs discovered using libFuzzer.
  behdad/harfbuzz#139
  Everyone should upgrade to this version as soon as possible.
  We now have continuous fuzzing set up, to avoid issues like
  these creeping in again.
- Misc fixes.

- New API:
  * hb_font_set_parent().
  * hb_ft_font_[sg]et_load_flags()
    The default flags for fonts created using hb_ft_font_create()
    has changed to default to FT_LOAD_DEFAULT now.  Previously it
    was defaulting to FT_LOAD_DFEAULT|FT_LOAD_NO_HINTING.

- API changes:
  * Fonts now default to units-per-EM as their scale, instead of 0.
  * hb_font_create_sub_font() does NOT make parent font immutable
    anymore.  hb_font_make_immutable() does.
0b594d6
@behdad
Owner
behdad commented Oct 28, 2015

The attached input takes over 400 seconds to process, all time is spent in OT::apply_lookup
slow-unit-b599e7814a2a9343b4aa8bedd4f275ca9d2e6fd2.pdf
Thanks. Will take a look. The problem is, the OpenType layout model is very expressive. You can easily get exponential expansion from it. That said, by handling circular recursions we might get most of those handled already...

To better understand this, see my new presentation, read the speaker notes, where I show that OpenType lookups are Turing complete if recursion depth is not limited...

https://goo.gl/yWYUrd

@kcc
Collaborator
kcc commented Oct 29, 2015

Is any workaround possible? Some way to limit the recursion (under a flag?) and return early?
The fuzzing is completely stalled :(

@behdad
Owner
behdad commented Oct 29, 2015

We already have a max recursion depth of 6 in place. I haven't had time to look at your worst examples to see what other measures I might be able to put in place.

@kcc
Collaborator
kcc commented Nov 3, 2015

despite the slow runs, here is one more trophy.
having all reports in this single bug actually looks silly to me now. Will be filing individual bugs separately. New one: #156

@behdad
Owner
behdad commented Nov 3, 2015

We already have a max recursion depth of 6 in place. I haven't had time to look at your worst examples to see what other measures I might be able to put in place.

I still haven't studied the examples, but in the mean time I added a compile time way to limit the recursion depth even further. Just define HB_MAX_NESTING_LEVEL to 2 or 3. Setting it to 1 might be too limiting.

@behdad behdad added a commit that referenced this issue Nov 3, 2015
@behdad [ot-font] Fix hmtx table length checking, *again*
Exactly the same problem that I fixed in
63ef0b4

I rewrote the table checking yesterday in
67f8821
and introduced the exact same issue again. :(
Good thing we have ongoing fuzzing going now.  Was discovered
immediately by libFuzzer.  Thanks kcc!

#139 (comment)
Fixes #156
df698f3
@kcc kcc added a commit to google/libfuzzer-bot that referenced this issue Nov 3, 2015
@kcc kcc [harfbuzz] use -DHB_MAX_NESTING_LEVEL=3 (behdad/harfbuzz#139, behdad/… ae2f8bd
@kcc
Collaborator
kcc commented Nov 5, 2015

In response to #157 (comment)

can you send me a with a README file

Something like this?

In order to build the fuzzer one needs to 
  - build harfbuz and harfbuzz/test/fuzzing/hb-fuzzer.cc
     * Using the most recent Clang
     * With -fsanitize=address (or =undefined, or a combination)
     * With -fsanitize-coverage=edge[,8bit-counters,trace-cmp]
     * With various defines that limit worst case exponential behavior.
        * -DHB_MAX_NESTING_LEVEL=3 
  - link against libFuzzer

To run the fuzzer one needs to first obtain a test corpus 
as a directory containing interesting fonts.
Then, run the fuzzer like this:
   ./hb-fuzzer -max_len=2048 CORPUS_DIR
Where max_len specifies the maximal length of files to handle. The smaller the better. 

For more details consult llvm.org/docs/LibFuzzer.html or
https://github.com/google/libfuzzer-bot/tree/master/harfbuzz

I can then add a Makefile.am for the most relevant parts.

That would be great. The more code gets moved from google/libfuzzer-bot to this repo the better.

@behdad behdad added a commit that referenced this issue Nov 6, 2015
@behdad [fuzzer] Add README 9c9ad21
@behdad
Owner
behdad commented Nov 6, 2015

In response to #157 (comment)

    can you send me a with a README file

Something like this?

In order to build the fuzzer one needs to

  • build harfbuz and harfbuzz/test/fuzzing/hb-fuzzer.cc
    • Using the most recent Clang
    • With -fsanitize=address (or =undefined, or a combination)
    • With -fsanitize-coverage=edge[,8bit-counters,trace-cmp]
    • With various defines that limit worst case exponential behavior.
      • -DHB_MAX_NESTING_LEVEL=3
  • link against libFuzzer

To run the fuzzer one needs to first obtain a test corpus
as a directory containing interesting fonts.
Then, run the fuzzer like this:
./hb-fuzzer -max_len=2048 CORPUS_DIR
Where max_len specifies the maximal length of files to handle. The smaller the better.

For more details consult llvm.org/docs/LibFuzzer.html or
https://github.com/google/libfuzzer-bot/tree/master/harfbuzz

Here:
https://github.com/behdad/harfbuzz/tree/master/test/fuzzing

    I can then add a Makefile.am for the most relevant parts.

That would be great. The more code gets moved from google/libfuzzer-bot to this repo the better.

Great. Send me anything else you think can be moved.

In the mean time, any news on adding a fuzzing vector for UTF-8 text?

@fanc999 fanc999 added a commit to fanc999/harfbuzz that referenced this issue Nov 6, 2015
@behdad @fanc999 + fanc999 [fuzzer] Add README 63bf580
@behdad
Owner
behdad commented Nov 6, 2015

I'm adding a build system. Had to change include "src/hb.h" to include <hb.h>, hope you can deal with it on your end.

@fanc999 fanc999 added a commit to fanc999/harfbuzz that referenced this issue Nov 6, 2015
@behdad @fanc999 + fanc999 [fuzzer] Add README 675c2b8
@fanc999 fanc999 added a commit to fanc999/harfbuzz that referenced this issue Nov 6, 2015
@behdad @fanc999 + fanc999 [fuzzer] Add README 5d0461f
@fanc999 fanc999 added a commit to fanc999/harfbuzz that referenced this issue Nov 6, 2015
@behdad @fanc999 + fanc999 [fuzzer] Add README b40b98f
@fanc999 fanc999 added a commit to fanc999/harfbuzz that referenced this issue Nov 6, 2015
@behdad @fanc999 + fanc999 [fuzzer] Add README 0aa89d1
@fanc999 fanc999 added a commit to fanc999/harfbuzz that referenced this issue Nov 6, 2015
@behdad @fanc999 + fanc999 [fuzzer] Add README 6ce672f
@simoncozens simoncozens added a commit to simoncozens/harfbuzz that referenced this issue Nov 6, 2015
@behdad @simoncozens + simoncozens [ot-font] Fix hmtx table length checking, *again*
Exactly the same problem that I fixed in
63ef0b4

I rewrote the table checking yesterday in
67f8821
and introduced the exact same issue again. :(
Good thing we have ongoing fuzzing going now.  Was discovered
immediately by libFuzzer.  Thanks kcc!

behdad#139 (comment)
Fixes behdad#156
25279f6
@simoncozens simoncozens added a commit to simoncozens/harfbuzz that referenced this issue Nov 6, 2015
@behdad @simoncozens + simoncozens [fuzzer] Add README 73ca386
@kcc
Collaborator
kcc commented Nov 6, 2015

In the mean time, any news on adding a fuzzing vector for UTF-8 text?

So far I don't have any nice solution for cross-fuzzing two kinds of inputs.
It's on my radar, but I want to collect more motivating examples first. (and get some time for quality design).

questions:
how long should be the text to touch interesting parts of code?
does it have to be a valid utf-8?

@jsonn jsonn pushed a commit to jsonn/pkgsrc that referenced this issue Nov 6, 2015
wiz Update harfbuzz to 1.0.6:
Overview of changes leading to 1.0.6
Thursday, October 15, 2015
====================================

- Reduce max nesting level in OT lookups from 8 to 6.
  Should not affect any real font as far as I know.
- Fix memory access issue in ot-font.
- Revert default load-flags of fonts created using hb_ft_font_create()
  back to FT_LOAD_DEFAULT|FT_LOAD_NO_HINTING.  This was changed in
  last release (1.0.5), but caused major issues, so revert.
  behdad/harfbuzz#143


Overview of changes leading to 1.0.5
Tuesday, October 13, 2015
====================================

- Fix multiple memory access bugs discovered using libFuzzer.
  behdad/harfbuzz#139
  Everyone should upgrade to this version as soon as possible.
  We now have continuous fuzzing set up, to avoid issues like
  these creeping in again.
- Misc fixes.

- New API:
  * hb_font_set_parent().
  * hb_ft_font_[sg]et_load_flags()
    The default flags for fonts created using hb_ft_font_create()
    has changed to default to FT_LOAD_DEFAULT now.  Previously it
    was defaulting to FT_LOAD_DFEAULT|FT_LOAD_NO_HINTING.

- API changes:
  * Fonts now default to units-per-EM as their scale, instead of 0.
  * hb_font_create_sub_font() does NOT make parent font immutable
    anymore.  hb_font_make_immutable() does.
5b98002
@behdad
Owner
behdad commented Nov 7, 2015

In the mean time, any news on adding a fuzzing vector for UTF-8 text?

So far I don't have any nice solution for cross-fuzzing two kinds of inputs.
It's on my radar, but I want to collect more motivating examples first. (and get some time for quality design).

questions:
how long should be the text to touch interesting parts of code?

Less than 8 or 16 characters.

does it have to be a valid utf-8?

Doesn't have to be. The useful ones are. But HB should work regardless.

@kcc
Collaborator
kcc commented Nov 7, 2015

What if we use hb_buffer_add_utf32 instead of utf8?
Will this examine (almost) the same functionality?
If so, we can treat last N*4 bytes of the font file as the buffer in ut32.
This is similar to the strategy I've used for regular expression matchers.

@behdad
Owner
behdad commented Nov 12, 2015

What if we use hb_buffer_add_utf32 instead of utf8?
Will this examine (almost) the same functionality?

Yes. It wouldn't exercise the UTF-8 decoder, but that's minor.

If so, we can treat last N*4 bytes of the font file as the buffer in ut32.

Yes, that works. And you'd need a new set of seeds I assume?

This is similar to the strategy I've used for regular expression matchers.

@kcc
Collaborator
kcc commented Nov 12, 2015

Let's do it then (treat last N*4 bytes of font as UTF32 string).
I don't think we'll need more seeds -- UTF32 is trivial enough that the fuzzer should discover interesting strings by itself

@kcc
Collaborator
kcc commented Dec 1, 2015

We can use something like to fuzz utf32.

  uint32_t text32[16];
  if (size > sizeof(text32)) {
    memcpy(text32, data + size - sizeof(text32), sizeof(text32));
    hb_buffer_t *buffer = hb_buffer_create();
    hb_buffer_add_utf32(buffer, text32, sizeof(text32)/sizeof(text32[0]), 0, -1);
    hb_buffer_guess_segment_properties(buffer);
    hb_shape(font, buffer, NULL, 0);
    hb_buffer_destroy(buffer);
  }
@behdad
Owner
behdad commented Dec 7, 2015

We can use something like to fuzz utf32.

uint32_t text32[16];
if (size > sizeof(text32)) {
memcpy(text32, data + size - sizeof(text32), sizeof(text32));
hb_buffer_t *buffer = hb_buffer_create();
hb_buffer_add_utf32(buffer, text32, sizeof(text32)/sizeof(text32[0]), 0, -1);
hb_buffer_guess_segment_properties(buffer);
hb_shape(font, buffer, NULL, 0);
hb_buffer_destroy(buffer);
}

Can you send a pull-request for that? Thanks.

@kcc
Collaborator
kcc commented Dec 10, 2015

Maybe you could just apply the patch below?

--- a/test/fuzzing/hb-fuzzer.cc
+++ b/test/fuzzing/hb-fuzzer.cc
@@ -1,9 +1,9 @@
 #include <stddef.h>
 #include <hb.h>
 #include <hb-ot.h>
+#include <string.h>

 extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
-  const char text[] = "ABCDEXYZ123@_%&)*$!";

   hb_blob_t *blob = hb_blob_create((const char *)data, size,
                                    HB_MEMORY_MODE_READONLY, NULL, NULL);
@@ -12,13 +12,26 @@ extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
   hb_ot_font_set_funcs(font);
   hb_font_set_scale(font, 12, 12);

-  hb_buffer_t *buffer = hb_buffer_create();
-  hb_buffer_add_utf8(buffer, text, -1, 0, -1);
-  hb_buffer_guess_segment_properties(buffer);
+  {
+    const char text[] = "ABCDEXYZ123@_%&)*$!";
+    hb_buffer_t *buffer = hb_buffer_create();
+    hb_buffer_add_utf8(buffer, text, -1, 0, -1);
+    hb_buffer_guess_segment_properties(buffer);
+    hb_shape(font, buffer, NULL, 0);
+    hb_buffer_destroy(buffer);
+  }
+
+  uint32_t text32[16];
+  if (size > sizeof(text32)) {
+    memcpy(text32, data + size - sizeof(text32), sizeof(text32));
+    hb_buffer_t *buffer = hb_buffer_create();
+    hb_buffer_add_utf32(buffer, text32, sizeof(text32)/sizeof(text32[0]), 0, -1);
+    hb_buffer_guess_segment_properties(buffer);
+    hb_shape(font, buffer, NULL, 0);
+    hb_buffer_destroy(buffer);
+  }

-  hb_shape(font, buffer, NULL, 0);

-  hb_buffer_destroy(buffer);
   hb_font_destroy(font);
   hb_face_destroy(face);
   hb_blob_destroy(blob);

(Or I'll need some time to figure out how pull requests work...)

@behdad
Owner
behdad commented Dec 10, 2015

Will do. Thanks.

@behdad
Owner
behdad commented Dec 10, 2015

I also need to change the fuzzing code to print out the UTF-32 test, such that we can easily reproduce with hb-shape or copy into our test suite.

@behdad behdad added a commit that referenced this issue Jan 11, 2016
@behdad [fuzzing] Fuzz input text
Very rudimentary right now, but will get kcc's bot going.

From
#139 (comment)
505b7d4
@behdad
Owner
behdad commented Jan 11, 2016

Maybe you could just apply the patch below?

Done.

@kcc
Collaborator
kcc commented Oct 6, 2016

Have you seen this?

harfbuzz-1.3.2: hb-buffer.cc:419: bool hb_buffer_t::move_to(unsigned int): Assertion `i <= out_len + (len - idx)' failed.
==29603== ERROR: libFuzzer: deadly signal
    #8 0x4f66e3 in hb_buffer_t::move_to(unsigned int) /usr/local/google/home/kcc/tmp/hb/BUILD/src/hb-buffer.cc:419:3
    #9 0x5ad271 in OT::apply_lookup(OT::hb_apply_context_t*, unsigned int, unsigned int*, unsigned int, OT::LookupRecord const*, unsigned int) hb-ot-layout-gsubgpos-private.hh:1036:11
    #10 0x610684 in OT::chain_context_apply_lookup(OT::hb_apply_context_t*, unsigned int, OT::IntType<unsigned short, 2u> const*, unsigned int, OT::IntType<unsigned short, 2u> const*, unsigned int, OT::IntType<unsigned short, 2u> const*, unsigned int, OT::LookupRecord const*, OT::ChainContextApplyLookupContext&) hb-ot-layout-gsubgpos-private.hh:1649:10
    #11 0x610684 in OT::ChainContextFormat3::apply(OT::hb_apply_context_t*) const hb-ot-layout-gsubgpos-private.hh:2089
    #12 0x604032 in bool OT::hb_apply_context_t::dispatch<OT::ChainContextFormat3>(OT::ChainContextFormat3 const&) hb-ot-layout-gsubgpos-private.hh:446:56
    #13 0x604032 in OT::hb_apply_context_t::return_t OT::ChainContext::dispatch<OT::hb_apply_context_t>(OT::hb_apply_context_t*) const hb-ot-layout-gsubgpos-private.hh:2140
    #14 0x604032 in OT::hb_apply_context_t::return_t OT::SubstLookupSubTable::dispatch<OT::hb_apply_context_t>(OT::hb_apply_context_t*, unsigned int) const hb-ot-layout-gsub-table.hh:1081
    #15 0x602d87 in OT::hb_apply_context_t::return_t OT::Lookup::dispatch<OT::SubstLookupSubTable, OT::hb_apply_context_t>(OT::hb_apply_context_t*) const hb-ot-layout-common-private.hh:628:71
    #16 0x602d87 in OT::hb_apply_context_t::return_t OT::SubstLookup::dispatch<OT::hb_apply_context_t>(OT::hb_apply_context_t*) const hb-ot-layout-gsub-table.hh:1231

I am getting it on 1.3.2, repro attached. (gzip-ed)
crash-e88c339237f52d21e01c55f01b9c1b4cc14a0467.gz

@behdad behdad added a commit that referenced this issue Dec 22, 2016
@behdad Fix assert fail with contextual matching
As discovered by libFuzzer / Chromium fuzzing.

Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=659496
CC #139
4b4a1b9
@behdad
Owner
behdad commented Dec 22, 2016

Have you seen this?

Cannot reproduce using hb-fuzzer. Can you double-check?

@iongchun iongchun added a commit to iongchun/harfbuzz that referenced this issue Jan 12, 2017
@behdad @iongchun + iongchun Fix assert fail with contextual matching
As discovered by libFuzzer / Chromium fuzzing.

Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=659496
CC behdad#139
857af51
@behdad behdad added a commit that referenced this issue Feb 25, 2017
@behdad Fix buffer-overrun with Bengali reph positioning code
This has no security implications whatsoever since we always keep
and extra element at the end of buffer, just in case.

Discovered by oss-fuzz
CC #139
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=660
8563099
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment