Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memory corruption when encoding images with hextree #1087

Closed
springmeyer opened this issue Feb 15, 2012 · 10 comments
Closed

memory corruption when encoding images with hextree #1087

springmeyer opened this issue Feb 15, 2012 · 10 comments
Assignees
Labels
Milestone

Comments

@springmeyer
Copy link
Member

  • leads to unpredictable crashes with new/delete.
  • Cannot replicate on OS X 10.7
  • Replicable on Ubuntu Oneiric 64 bit
  • testcase at https://gist.github.com/1881085
  • g++ 4.6
 g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)

Crashes are variable, but here is a common one:

   glibc detected *** node: malloc(): smallbin double linked list corrupted: 0x00000000021b17d0 ***
    ======= Backtrace: =========
    /lib/x86_64-linux-gnu/libc.so.6(+0x78a96)[0x7ff3712baa96]
    /lib/x86_64-linux-gnu/libc.so.6(+0x7accc)[0x7ff3712bcccc]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x6d)[0x7ff3712be77d]
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_Znwm+0x1d)[0x7ff371d524cd]
    /usr/lib/libmapnik.so.2.0(_ZN5boost16unordered_detail17hash_unique_tableINS0_3mapIN6mapnik4rgbaENS4_9hash_funcESt8equal_toIS4_ESaISt4pairIKS4_iEEEEEixERS9_+0x10a)[0x7ff34c8958ca]
    /usr/lib/libmapnik.so.2.0(_ZNK6mapnik7hextreeINS_4rgbaENS_10RGBAPolicyEE8quantizeERKS1_+0x3aa)[0x7ff34c895dfa]
    /usr/lib/libmapnik.so.2.0(_ZN6mapnik12save_as_png8ISoNS_10image_viewINS_9ImageDataIjEEEENS_7hextreeINS_4rgbaENS_10RGBAPolicyEEEEEvRT_RKT0_RKT1_RKSt6vectorINS_3rgbESaISI_EERKSH_IjSaIjEEii+0x29a)[0x7ff34c8960ba]
    /usr/lib/libmapnik.so.2.0(_ZN6mapnik16save_as_png8_hexISoNS_10image_viewINS_9ImageDataIjEEEEEEvRT_RKT0_iiiid+0x1008)[0x7ff34c8972d8]
    /usr/lib/libmapnik.so.2.0(_ZN6mapnik14save_to_streamINS_10image_viewINS_9ImageDataIjEEEEEEvRKT_RSoRKSs+0x273)[0x7ff34c89caf3]
    /usr/lib/libmapnik.so.2.0(_ZN6mapnik14save_to_stringINS_10image_viewINS_9ImageDataIjEEEEEESsRKT_RKSs+0x2f)[0x7ff34c89cd5f]
    /usr/share/tilemill/node_modules/mapnik/lib/_mapnik.node(_ZN9ImageView10EIO_EncodeEP7eio_req+0x5a)[0x7ff34cc77a8a]
    node[0x538ab0]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x7efc)[0x7ff3715e8efc]
    /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7ff37132389d]
    ======= Memory map: ========
    00400000-009a3000 r-xp 00000000 fd:03 4397000                            /usr/bin/nodejs
    00ba2000-00ba3000 r--p 005a2000 fd:03 4397000                            /usr/bin/nodejs
    00ba3000-00bad000 rw-p 005a3000 fd:03 4397000                            /usr/bin/nodejs
@springmeyer
Copy link
Member Author

latest backtrace looking a bit like: #948

@springmeyer
Copy link
Member Author

The above commit fixes the testcase. Valgrind output at https://gist.github.com/1880708 helped point me to this fix, but this needs further review. /cc @mrudowski1 && @artemp

/cc @yhahn for testing help.

@ghost ghost assigned springmeyer Feb 22, 2012
@springmeyer
Copy link
Member Author

re-opening and providing another, yet simplified testcase, because my "fix" in 5dee576 needs to be better understood before we call this done.

@springmeyer
Copy link
Member Author

New C++ testcase using nothing but a single image (along with setup instructions): https://gist.github.com/1886226

@springmeyer
Copy link
Member Author

Things I've tried that have been unsuccessful in revealing the root cause:

And here are the printed trees leading up to the crash: https://gist.github.com/1886745

@springmeyer
Copy link
Member Author

starting to look like a problem in the function passed to sort, as it does not conform to the strict weak ordering requirement:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=553#c6
http://www.sgi.com/tech/stl/StrictWeakOrdering.html

@springmeyer
Copy link
Member Author

Another workaround appears to be to downgrade the gcc/g++ version used on oneiric:

sudo apt-get install g++-4.4
cd <mapnik sources>
./configure CXX="g++-4.4"
make install

@springmeyer
Copy link
Member Author

better fix coming now, so rolled back to the original code skipping nodes with
< 3 pixels (5dee576) for now in 188ba77

@springmeyer
Copy link
Member Author

4a192c3 fixes for g++ 4.6 on ubuntu oneiric

@kkaefer
Copy link
Member

kkaefer commented Feb 24, 2012

Note that the reason for this bug was that the rgba::mean_sort_cmp::operator() function didn't adhere to a strict weak ordering criterion as required by std::sort. The old implementation is

bool compare(const rgba& x, const rgba& y) {
    int t1 = (int)x.a + x.r + x.g + x.b;
    int t2 = (int)y.a + y.r + y.g + y.b;
    if (t1 != t2) return t1 < t2;

    return (((int)x.a - y.a) >> 24) +
        (((int)x.r - y.r) >> 16) +
        (((int)x.g - y.g) >> 8) +
        (((int)x.b - y.b));
}

The strict weak ordering criterion means that if you have three items a, b, c: a ≮ b ∧ b ≮ c ⟹ a ≮ c. The second part of the old function didn't fulfill that criterion. To get past the first if, a, b and c's rgba values have to add up to the same number. To trigger a violation of the strict weak ordering, take this example (in a/r/g/b order):

  • a: 0/0/0/2
  • b: 1/0/0/1
  • c: 2/0/0/0

When comparing a and b, (0 - 1) >> 24 = -1 and 2 - 1 = 1, resulting in 0, so a ≮ b is true.
When comparing b and c, (1 - 2) >> 24 = -1 and 1 - 0 = 1, resulting in 0 so b ≮ c is true.
This means the premise is true.
When comparing a and c, (0 - 2) >> 24 = -1 and 2 - 0 = 2, resulting in 1 so a ≮ c is false (because a is < c).
This means the implication is false.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants