memory corruption when encoding images with hextree #1087

Closed
springmeyer opened this Issue Feb 15, 2012 · 10 comments

Comments

Projects
None yet
2 participants
Owner

springmeyer commented Feb 15, 2012

  • leads to unpredictable crashes with new/delete.
  • Cannot replicate on OS X 10.7
  • Replicable on Ubuntu Oneiric 64 bit
  • testcase at https://gist.github.com/1881085
  • g++ 4.6
 g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)

Crashes are variable, but here is a common one:

   glibc detected *** node: malloc(): smallbin double linked list corrupted: 0x00000000021b17d0 ***
    ======= Backtrace: =========
    /lib/x86_64-linux-gnu/libc.so.6(+0x78a96)[0x7ff3712baa96]
    /lib/x86_64-linux-gnu/libc.so.6(+0x7accc)[0x7ff3712bcccc]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x6d)[0x7ff3712be77d]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_Znwm+0x1d)[0x7ff371d524cd]
    /usr/lib/libmapnik.so.2.0(_ZN5boost16unordered_detail17hash_unique_tableINS0_3mapIN6mapnik4rgbaENS4_9hash_funcESt8equal_toIS4_ESaISt4pairIKS4_iEEEEEixERS9_+0x10a)[0x7ff34c8958ca]
    /usr/lib/libmapnik.so.2.0(_ZNK6mapnik7hextreeINS_4rgbaENS_10RGBAPolicyEE8quantizeERKS1_+0x3aa)[0x7ff34c895dfa]
    /usr/lib/libmapnik.so.2.0(_ZN6mapnik12save_as_png8ISoNS_10image_viewINS_9ImageDataIjEEEENS_7hextreeINS_4rgbaENS_10RGBAPolicyEEEEEvRT_RKT0_RKT1_RKSt6vectorINS_3rgbESaISI_EERKSH_IjSaIjEEii+0x29a)[0x7ff34c8960ba]
    /usr/lib/libmapnik.so.2.0(_ZN6mapnik16save_as_png8_hexISoNS_10image_viewINS_9ImageDataIjEEEEEEvRT_RKT0_iiiid+0x1008)[0x7ff34c8972d8]
    /usr/lib/libmapnik.so.2.0(_ZN6mapnik14save_to_streamINS_10image_viewINS_9ImageDataIjEEEEEEvRKT_RSoRKSs+0x273)[0x7ff34c89caf3]
    /usr/lib/libmapnik.so.2.0(_ZN6mapnik14save_to_stringINS_10image_viewINS_9ImageDataIjEEEEEESsRKT_RKSs+0x2f)[0x7ff34c89cd5f]
    /usr/share/tilemill/node_modules/mapnik/lib/_mapnik.node(_ZN9ImageView10EIO_EncodeEP7eio_req+0x5a)[0x7ff34cc77a8a]
    node[0x538ab0]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x7efc)[0x7ff3715e8efc]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7ff37132389d]
    ======= Memory map: ========
    00400000-009a3000 r-xp 00000000 fd:03 4397000                            /usr/bin/nodejs
    00ba2000-00ba3000 r--p 005a2000 fd:03 4397000                            /usr/bin/nodejs
    00ba3000-00bad000 rw-p 005a3000 fd:03 4397000                            /usr/bin/nodejs
Owner

springmeyer commented Feb 15, 2012

latest backtrace looking a bit like: #948

Owner

springmeyer commented Feb 22, 2012

The above commit fixes the testcase. Valgrind output at https://gist.github.com/1880708 helped point me to this fix, but this needs further review. /cc @mrudowski1 && @artemp

/cc @yhahn for testing help.

springmeyer was assigned Feb 22, 2012

Owner

springmeyer commented Feb 22, 2012

re-opening and providing another, yet simplified testcase, because my "fix" in 5dee576 needs to be better understood before we call this done.

Owner

springmeyer commented Feb 22, 2012

New C++ testcase using nothing but a single image (along with setup instructions): https://gist.github.com/1886226

Owner

springmeyer commented Feb 22, 2012

Things I've tried that have been unsuccessful in revealing the root cause:

And here are the printed trees leading up to the crash: https://gist.github.com/1886745

Owner

springmeyer commented Feb 22, 2012

starting to look like a problem in the function passed to sort, as it does not conform to the strict weak ordering requirement:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=553#c6
http://www.sgi.com/tech/stl/StrictWeakOrdering.html

Owner

springmeyer commented Feb 22, 2012

Another workaround appears to be to downgrade the gcc/g++ version used on oneiric:

sudo apt-get install g++-4.4
cd <mapnik sources>
./configure CXX="g++-4.4"
make install
Owner

springmeyer commented Feb 22, 2012

better fix coming now, so rolled back to the original code skipping nodes with
< 3 pixels (5dee576) for now in 188ba77

Owner

springmeyer commented Feb 22, 2012

4a192c3 fixes for g++ 4.6 on ubuntu oneiric

@springmeyer springmeyer pushed a commit that referenced this issue Feb 23, 2012

Dane Springmeyer backport hextree sorting fix to 0.7.x series - refs #1087 6737bd9

@springmeyer springmeyer pushed a commit that referenced this issue Feb 24, 2012

Dane Springmeyer backport hextree sorting fix to 2.0.x series - refs #1087 ee8a150
Member

kkaefer commented Feb 24, 2012

Note that the reason for this bug was that the rgba::mean_sort_cmp::operator() function didn't adhere to a strict weak ordering criterion as required by std::sort. The old implementation is

bool compare(const rgba& x, const rgba& y) {
    int t1 = (int)x.a + x.r + x.g + x.b;
    int t2 = (int)y.a + y.r + y.g + y.b;
    if (t1 != t2) return t1 < t2;

    return (((int)x.a - y.a) >> 24) +
        (((int)x.r - y.r) >> 16) +
        (((int)x.g - y.g) >> 8) +
        (((int)x.b - y.b));
}

The strict weak ordering criterion means that if you have three items a, b, c: a ≮ b ∧ b ≮ c ⟹ a ≮ c. The second part of the old function didn't fulfill that criterion. To get past the first if, a, b and c's rgba values have to add up to the same number. To trigger a violation of the strict weak ordering, take this example (in a/r/g/b order):

  • a: 0/0/0/2
  • b: 1/0/0/1
  • c: 2/0/0/0

When comparing a and b, (0 - 1) >> 24 = -1 and 2 - 1 = 1, resulting in 0, so a ≮ b is true.
When comparing b and c, (1 - 2) >> 24 = -1 and 1 - 0 = 1, resulting in 0 so b ≮ c is true.
This means the premise is true.
When comparing a and c, (0 - 2) >> 24 = -1 and 2 - 0 = 2, resulting in 1 so a ≮ c is false (because a is < c).
This means the implication is false.

@mirecta mirecta pushed a commit to mapycz/mapnik that referenced this issue Mar 5, 2012

Dane Springmeyer ensure mean_sort_cmp adheres to strict weak ordering requirement - mo…
…re properly fixes #1087
4a192c3

springmeyer referenced this issue in mapbox/carmen-cache Dec 24, 2014

Closed

Crashing "real" test #7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment