persistent crashes during plugin initialization #1459

Closed
springmeyer opened this Issue Sep 3, 2012 · 13 comments

Owner

springmeyer commented Sep 3, 2012

Using TileMill, infrequently (1 in 10), but persistently (seen this over many months of use), I can trigger crashes related to mapnik and what looks like a race condition in plugin loading. I thought we solved this in #951, but some related but slightly more subtle issue seems to still persist.

  • Happens more frequently on "cold starts", where Mapnik has been recently recompiled and/or TileMill and Mapnik have not been run recently. I presume this is because the plugins then tend to load more slowly, which makes the issue more likely to be triggered.
  • Seems to happen more often when the OGR plugin is involved, but this needs proving.
  • Last time I saw this (just now) I had a lot of plugins installed:

For reference: Darwin code for dlopen and friends: http://www.opensource.apple.com/source/dyld/dyld-132.13/src/

$ ls /usr/local/lib/mapnik/input/
csv.input   geojson.input  hello.input   occi.input  osm.input      python.input  shape.input
gdal.input  geos.input     kismet.input  ogr.input   postgis.input  raster.input  sqlite.input

In the TileMill console I see:

[tilemill] Checking for new version of TileMill...
[tilemill] npm 
[tilemill] http GET https://registry.npmjs.org/tilemill
[tilemill] node(63021,0x147dd8000) malloc: *** error for object 0x7fac8ae018a0: pointer being freed was not allocated
[tilemill] *** set a breakpoint in malloc_error_break to debug
[tilemill] node(63021,0x147b4a000) malloc: *** error for object 0x7fac8ae018a0: double free
[tilemill] *** set a breakpoint in malloc_error_break to debug
[tilemill] node(63021,0x147e1b000) malloc: *** error for object 0x7fac8ae018a0: pointer being freed was not allocated
[tilemill] *** set a breakpoint in malloc_error_break to debug
[tilemill] npm
[tilemill]  http 304 https://registry.npmjs.org/tilemill
[tilemill] Latest version of TileMill is 0.9.1.

(NOTE: I presume the TileMill version-checking output is unrelated to the crash; it is just other initialization code that happens to be executing at the same time.)

And the log in /Users/dane/Library/Logs/DiagnosticReports/node_2012-09-03-110116_dane-2.crash: https://gist.github.com/springmeyer/c0b4d17635c7c22acc43

Owner

springmeyer commented Sep 13, 2012

no longer seeing any crashes after collapsing my built plugins down to: INPUT_PLUGINS = 'csv,gdal,geojson,ogr,osm,postgis,raster,shape,sqlite'. So, going to close this. Will re-open if I can isolate which plugin is to blame, likely one of occi, hello, python, geos, or kismet. It may be that f73168a fixed this too.

Owner

springmeyer commented Jan 14, 2013

still seeing these crashes, pretty sure it is due to a race condition when plugins are registered/loaded in different threads because of the async load_map call in TileMill/node-mapnik. re-opening.
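
To illustrate the kind of race described here, the following is a minimal sketch of plugin registration that is serialized with a mutex, so that several threads asking for the same plugin result in a single load. The names `PluginRegistry`, `register_plugin`, and `register_concurrently` are hypothetical and are not Mapnik's API; the actual `dlopen` call is only simulated.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical registry used for illustration only; not Mapnik's real code.
class PluginRegistry {
public:
    // Returns true if this call performed the load, false if already loaded.
    bool register_plugin(const std::string& path) {
        // The check and the load happen under one lock, so two threads
        // can never both decide that the plugin still needs loading.
        std::lock_guard<std::mutex> lock(mutex_);
        if (loaded_.count(path) != 0) {
            return false;
        }
        // A real implementation would dlopen() the plugin here (assumption).
        loaded_.insert(path);
        ++load_count_;
        return true;
    }

    int load_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return load_count_;
    }

private:
    mutable std::mutex mutex_;
    std::set<std::string> loaded_;
    int load_count_ = 0;
};

// Registers the same plugin from several threads and reports how many
// loads actually took place.
int register_concurrently(int thread_count) {
    PluginRegistry registry;
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&registry] { registry.register_plugin("ogr.input"); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return registry.load_count();
}
```

Without the lock, two threads could both pass the "already loaded?" check and both load and later free the same plugin state, which would be consistent with the double-free messages in the log above, although that link is not proven in this thread.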

@springmeyer springmeyer reopened this Jan 14, 2013

Contributor

strk commented Feb 4, 2013

I'm seeing the crash from nodejs, so I'm not sure it can be due to multi-threading (isn't node single-threaded?)

Contributor

strk commented Feb 4, 2013

Dane: you mentioned this could be worked around by reading XML in sync mode.
I'm getting "conditional jump depends on uninitialised value" error reports from valgrind in libxml, pretty much these ones: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=577135 -- maybe that's the cause

Contributor

strk commented Feb 6, 2013

I took a look at mapnik::singleton::instance and I can't see a reason for it to lock the mutex after checking for pInstance_ being set:

https://github.com/mapnik/mapnik/blob/2.1.x/include/mapnik/utils.hpp#L132

Also, the DestroySingleton is not mutex-protected and the "destroyed_" value is likely not needed.

Owner

springmeyer commented Feb 6, 2013

RE: single thread vs multi-thread: yes, node is single-threaded in the sense that there is a single thread running the main event loop. But async calls like fs.writeFile in node core or map.load in node-mapnik go into the libuv thread pool.
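
As a rough analogy for the point above (this sketch uses the C++ standard library's `std::async`, not libuv, and `run_off_main_thread` is a hypothetical name), work handed off by an "async" call executes on a different thread than the caller, even though the caller itself is single-threaded:

```cpp
#include <cassert>
#include <future>
#include <thread>

// Stand-in for an async binding call: the work is executed on another
// thread and the caller only receives the result. This is an analogy for
// a thread pool hand-off, not how libuv is actually implemented.
std::thread::id run_off_main_thread() {
    std::future<std::thread::id> result = std::async(
        std::launch::async,  // force execution on a separate thread
        [] { return std::this_thread::get_id(); });
    return result.get();  // blocks until the worker has finished
}
```

Any native code reached from such a worker (plugin registration, for example) can therefore run concurrently with code reached from another worker or from the main thread.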

RE: uninitialized jumps - I doubt this is the problem/related, but yes I see those whenever I use valgrind with zlib/xml2 and I personally ignore them

RE: singleton code - I'm not sure, I did not write it. But I suspect it is not thread-safe when called from different threads concurrently.

@springmeyer springmeyer referenced this issue in mapbox/tilelive-mapnik Feb 7, 2013

Closed

Move (back) to synchronous map loading #58

springmeyer pushed a commit to mapbox/tilelive-mapnik that referenced this issue Feb 7, 2013

Owner

springmeyer commented Feb 9, 2013

roh, ruh. now able to replicate this crash without async map loading: https://gist.github.com/springmeyer/4743854

hitting it when running the tilelive-mapnik tests using the image-pool branch. The cause this time appears to be async image.clear() or grid.clear()

Contributor

strk commented Feb 11, 2013

Great! Does locking the mutex as the first thing in getSingleton help? The paper you linked above only explains why double-checking is still unsafe, but it does report aggressive locking to be safe (if maybe somewhat slower).
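
A minimal sketch of the "lock first" approach suggested here, assuming a simplified class: `Registry`, `instance`, and `hammer_instance` are hypothetical names, and this is not the code in `mapnik::singleton`. The mutex is taken before the pointer is inspected, so there is no unsynchronized read of the instance pointer.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical singleton for illustration; not mapnik::singleton.
class Registry {
public:
    static Registry& instance() {
        // Lock before looking at the pointer: every reader and the one
        // writer are ordered by the mutex, at the cost of a lock per call.
        std::lock_guard<std::mutex> lock(mutex_);
        if (instance_ == nullptr) {
            instance_ = new Registry();  // intentionally never deleted
            ++constructions_;
        }
        return *instance_;
    }

    // Number of times the constructor path has run (for testing).
    static int constructions() {
        std::lock_guard<std::mutex> lock(mutex_);
        return constructions_;
    }

private:
    Registry() = default;
    static std::mutex mutex_;
    static Registry* instance_;
    static int constructions_;
};

std::mutex Registry::mutex_;
Registry* Registry::instance_ = nullptr;
int Registry::constructions_ = 0;

// Calls instance() from several threads and returns the construction count.
int hammer_instance(int thread_count) {
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([] { Registry::instance(); });
    }
    for (auto& t : threads) {
        t.join();
    }
    return Registry::constructions();
}
```

With a C++11 compiler, a function-local `static` object is another option, since its initialization is guaranteed to be thread-safe by the language.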

Contributor

strk commented Feb 18, 2013

@springmeyer do you need a pull request for testing this with a safe singleton?

Owner

springmeyer commented Mar 13, 2013

see also #1536, makes me wonder if this is simply rendering starting while a map is still being loaded.
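
If that hypothesis holds, one way to rule it out is to make rendering wait on the completion of the load. The sketch below assumes hypothetical names (`Map`, `load_async`, `render_after_load`, `load_then_render`) and simulates loading with a short sleep; it is not node-mapnik's API.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for a map object.
struct Map {
    bool loaded = false;
    int layer_count = 0;
};

// Starts loading on another thread and returns a handle to wait on.
std::future<void> load_async(Map& map) {
    return std::async(std::launch::async, [&map] {
        // Simulated slow load, e.g. a cold start (assumption).
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        map.layer_count = 3;
        map.loaded = true;
    });
}

// Renders only after the pending load has finished. Returns the number of
// layers "rendered", or -1 if the map was somehow not loaded.
int render_after_load(Map& map, std::future<void>& pending) {
    pending.wait();  // synchronizes with the end of the loader thread
    return map.loaded ? map.layer_count : -1;
}

int load_then_render() {
    Map map;
    std::future<void> pending = load_async(map);
    return render_after_load(map, pending);
}
```

Dropping the `pending.wait()` call would let the renderer read `map` while the loader is still writing it, which is a data race.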

Owner

springmeyer commented Jun 25, 2014

just noting: I've not seen this recently.

@springmeyer springmeyer removed this from the Mapnik 2.1.1 milestone Sep 6, 2014

Owner

springmeyer commented Sep 6, 2014

still not seen recently, so going to assume the cause was a subtle memory corruption bug which is now fixed.

@springmeyer springmeyer closed this Sep 6, 2014

@davenquinn davenquinn referenced this issue in mapbox/mapnik-pool Jan 14, 2016

Open

Added option for synchronous map loading #7
