intermittent segfault during grid_view encoding in tests #110

Closed
springmeyer opened this Issue Jul 12, 2012 · 13 comments

Owner

springmeyer commented Jul 12, 2012

I get this every 3rd or 4th time when running:

make test

It looks like memory corruption somewhere that is being exposed while we encode features. It only seems to happen when encoding grids for grid_views.


Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libstdc++.6.dylib               0x00007fff930248b1 std::string::compare(std::string const&) const + 7
1   _mapnik.node                    0x00000001250a3ebd bool std::operator< <char, std::char_traits<char>, std::allocator<char> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 29 (basic_string.h:2227)
2   _mapnik.node                    0x00000001250a3e21 std::less<std::string>::operator()(std::string const&, std::string const&) const + 33 (stl_function.h:227)
3   _mapnik.node                    0x00000001250edb48 std::_Rb_tree<std::string, std::string, std::_Identity<std::string>, std::less<std::string>, std::allocator<std::string> >::find(std::string const&) const + 88 (stl_tree.h:1399)
4   _mapnik.node                    0x00000001250eb3bd std::set<std::string, std::less<std::string>, std::allocator<std::string> >::find(std::string const&) const + 29 (stl_set.h:434)
5   _mapnik.node                    0x00000001250f6f29 _ZN11node_mapnikL14write_featuresIN6mapnik13hit_grid_viewINS1_9ImageDataIiEEEEEEvRKT_RN2v85LocalINS9_6ObjectEEERKSt6vectorINS6_11lookup_typeESaISF_EE + 2633 (js_grid_utils.hpp:209)
6   _mapnik.node                    0x00000001250f7932 GridView::EIO_AfterEncode(uv_work_s*) + 1026 (mapnik_grid_view.cpp:385)
7   node                            0x0000000100632735 uv__after_work + 69
8   node                            0x0000000100648557 eio_finish + 55
9   node                            0x0000000100645965 etp_poll + 373
10  node                            0x00000001006457e9 eio_poll + 9
11  node                            0x000000010063c917 uv_eio_want_poll_notifier_cb + 103
12  node                            0x000000010062ede4 uv__async + 68
13  node                            0x000000010063f009 ev_invoke_pending + 169
14  node                            0x000000010063f5f3 ev_run + 1491
15  node                            0x000000010062dff3 uv_run + 35
16  node                            0x00000001003e2c44 node::Start(int, char**) + 116
17  node                            0x00000001003ddce4 start + 52

springmeyer commented Jul 12, 2012

It crashes both before and after b128c7e; the backtrace above was taken after b128c7e.

springmeyer commented Jul 12, 2012

Oddly, running just the crashing test on its own does not trigger a crash:

export NODE_PATH=./lib
mocha test/render_grid.test.js

Also, breaking out the "should match expected output (async rendering view)" test as a standalone does not crash.

Sadly I cannot get gdb to follow the crashing process given the way that mocha works. I tried:

gdb node --args node /usr/local/bin/mocha -R Spec
(gdb) set follow-fork-mode child
(gdb) r # crashes, but gdb does not see....

springmeyer commented Jul 12, 2012

ah, ha, this works:

gdb node --args node node_modules/mocha/bin/_mocha -R Spec

springmeyer commented Jul 12, 2012

gdb bt:


(gdb) thread apply all bt

Thread 3 (process 80947):
#0  0x00007fff9371fbca in __psynch_cvwait ()
#1  0x00007fff8b559274 in _pthread_cond_wait ()
#2  0x000000010026bb0c in etp_proc ()
#3  0x00007fff8b5558bf in _pthread_start ()
#4  0x00007fff8b558b75 in thread_start ()

Thread 2 (process 80947):
#0  0x00007fff9371e6b6 in semaphore_wait_trap ()
#1  0x000000010018548b in v8::internal::RuntimeProfilerRateLimiter::SuspendIfNecessary ()
#2  0x0000000001001709 in ?? ()

Thread 1 (process 80947):
#0  0x00007fff930248b1 in std::string::compare ()
#1  0x0000000126028ebd in std::operator< <char, std::char_traits<char>, std::allocator<char> > (__lhs=@0x23, __rhs=@0x7fff5fbff0e8) at basic_string.h:2227
#2  0x0000000126028e21 in std::less<std::string>::operator() (this=0x12893c7a8, __x=@0x23, __y=@0x7fff5fbff0e8) at stl_function.h:227
#3  0x0000000126072b48 in std::_Rb_tree<std::string, std::string, std::_Identity<std::string>, std::less<std::string>, std::allocator<std::string> >::find (this=0x12893c7a8, __k=@0x7fff5fbff0e8) at stl_tree.h:1399
#4  0x00000001260703bd in std::set<std::string, std::less<std::string>, std::allocator<std::string> >::find (this=0x12893c7a8, __x=@0x7fff5fbff0e8) at stl_set.h:434
#5  0x000000012607bf29 in write_features (grid_type=@0x128928080, feature_data=@0x7fff5fbff570, key_order=@0x128912ab8) at js_grid_utils.hpp:208
#6  0x000000012607c932 in GridView::EIO_AfterEncode (req=0x128912a40) at mapnik_grid_view.cpp:385
#7  0x0000000100255735 in uv__after_work ()
#8  0x000000010026b557 in eio_finish ()
#9  0x0000000100268965 in etp_poll ()
#10 0x00000001002687e9 in eio_poll ()
#11 0x000000010025f917 in uv_eio_want_poll_notifier_cb ()
#12 0x0000000100251de4 in uv__async ()
#13 0x0000000100262009 in ev_invoke_pending ()
#14 0x00000001002625f3 in ev_run ()
#15 0x0000000100250ff3 in uv_run ()
#16 0x0000000100005c44 in node::Start ()
#17 0x0000000100000ce4 in start ()

springmeyer commented Jul 12, 2012

Looks like the crash is at attributes.find():

else if ( (attributes.find(feat_key_name) != attributes.end()) )

(gdb) frame 5
#5  0x000000012607bf29 in write_features (grid_type=@0x128928080, feature_data=@0x7fff5fbff570, key_order=@0x128912ab8) at js_grid_utils.hpp:208
208                 else if ( (attributes.find(feat_key_name) != attributes.end()) )
Current language:  auto; currently c++
(gdb) p feat_key_name
$1 = {
  _M_dataplus = {
    <std::allocator<char>> = {
      <__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, 
    members of std::basic_string<char>::_Alloc_hider: 
    _M_p = 0x128931af8 "NAME"
  }, 
  static npos = 18446744073709551615
}
(gdb) p attributes   
$2 = ('std::set<std::basic_string<char>, std::less<std::basic_string<char> >, std::allocator<std::basic_string<char> > >' &) @0x12893c7a8: {
  _M_t = {
    _M_impl = {
      <std::allocator<std::_Rb_tree_node<std::basic_string<char> > >> = {
        <__gnu_cxx::new_allocator<std::_Rb_tree_node<std::basic_string<char> > >> = {<No data fields>}, <No data fields>}, 
      members of std::_Rb_tree<std::basic_string<char>, std::basic_string<char>, std::_Identity<std::basic_string<char> >, std::less<std::basic_string<char> >, std::allocator<std::basic_string<char> > >::_Rb_tree_impl<std::less<std::basic_string<char> >, false>: 
      _M_key_compare = {
        <std::binary_function<std::basic_string<char>, std::basic_string<char>, bool>> = {<No data fields>}, <No data fields>}, 
      _M_header = {
        _M_color = std::_S_red, 
        _M_parent = 0x12891b8c0, 
        _M_left = 0x12891b8c0, 
        _M_right = 0x12891b8c0
      }, 
      _M_node_count = 1
    }
  }
}
(gdb) p attributes.size()
$3 = 1
(gdb) p attributes[0]    
Structure has no component named operator[].
(gdb) p attributes.begin()
$4 = {
  _M_node = 0x12891b8c0
}
(gdb) p *attributes.begin()
Attempt to take address of value not located in memory.

springmeyer commented Jul 12, 2012

A possible problem: the only access to attributes in grid.hpp and grid_view.hpp is std::set<std::string> const& property_names() const, which returns a reference, while a true copy is expected here.
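
To make the reference-versus-copy concern concrete, here is a minimal sketch. ToyGrid, copy_names and names_survive_grid_destruction are hypothetical names for illustration only; the accessor merely mirrors the signature quoted above and is not the real mapnik class.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical stand-in for a grid type (assumption: NOT the real mapnik class).
struct ToyGrid {
    std::set<std::string> names_;
    // Same shape as the accessor quoted above: returns a reference, not a copy.
    std::set<std::string> const& property_names() const { return names_; }
};

// Returning by value forces a true copy that is independent of the grid.
std::set<std::string> copy_names(ToyGrid const& g) {
    return g.property_names();
}

// Returns true if the copied names are still usable after the grid is destroyed.
bool names_survive_grid_destruction() {
    std::unique_ptr<ToyGrid> grid(new ToyGrid());
    grid->names_.insert("NAME");

    std::set<std::string> owned = copy_names(*grid);
    // Binding `std::set<std::string> const&` to grid->property_names() instead
    // would leave a dangling reference after the reset below, and calling
    // find() on it would be undefined behavior.
    grid.reset();

    return owned.find("NAME") != owned.end();
}
```

Holding the copy costs one extra set allocation, but it removes any dependency on the lifetime of the object that owns the original set.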

springmeyer commented Jul 12, 2012

After mapnik/mapnik@da77505 I have run the tests about 50 times and seen one crash only - hopefully that one crash was a fluke.

springmeyer commented Jul 12, 2012

nope, actually still seeing crashes in about 1 of every 4 runs.

springmeyer commented Jul 12, 2012

hmm, now testing with node v0.8.2 (previously I was testing with v0.6.18), I cannot replicate with node v0.8.

springmeyer commented Jul 12, 2012

closing, as node v0.8 is a viable workaround.

springmeyer reopened this Jul 12, 2012

springmeyer commented Jul 12, 2012

More details: with node v0.6.18 I am unable to replicate if I run the tests without turning on more printed output (e.g. the default mocha dot reporter does not crash). So I cannot prompt a crash with plain mocha, only with mocha -R Spec (I have not tried the other output formats).

Also, if I disable these compositing.test.js test lines the crash goes away.

As a last clue, on some runs I get a bogus check mark from mocha -R Spec when compositing.test.js is run:

springmeyer commented Jul 12, 2012

running mocha -R tap also does not crash but on one run I saw another indication of memory corruption:

should not get here: key '' not found in grid feature properties
should not get here: key '' not found in grid feature properties
springmeyer commented Jul 12, 2012

okay, I see the problem now: the grid object itself is getting cleaned up by v8 while the grid_view is still in the event loop doing encoding. This can be prevented by putting a g->_ref() at https://github.com/mapnik/node-mapnik/blob/master/src/mapnik_grid.cpp#L217. Next step is to figure out a good way to unref() the grid when the grid_view is cleaned up.
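
The idea behind the fix can be sketched in plain C++. This is an analogy only: it uses std::shared_ptr in place of the v8/node reference counting, and ToyGrid, ToyGridView, has_key and view_keeps_grid_alive are hypothetical names, not node-mapnik code. The view takes a counted reference to its parent, so the parent survives until the view itself goes away.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-ins (assumption: the real objects are v8-wrapped,
// not managed by shared_ptr).
struct ToyGrid {
    std::set<std::string> names;
};

struct ToyGridView {
    // Holding a shared_ptr plays the role of the extra ref on the grid.
    std::shared_ptr<ToyGrid> parent;
    explicit ToyGridView(std::shared_ptr<ToyGrid> g) : parent(std::move(g)) {}

    // Stand-in for the encode step that looks up property names on the parent.
    bool has_key(std::string const& key) const {
        return parent->names.find(key) != parent->names.end();
    }
};

// Returns true if the view can still read the grid after the caller's handle
// is dropped, and the grid is released once the view is destroyed.
bool view_keeps_grid_alive() {
    std::shared_ptr<ToyGrid> grid = std::make_shared<ToyGrid>();
    grid->names.insert("NAME");
    std::weak_ptr<ToyGrid> watcher = grid;  // observes lifetime without owning

    bool ok = false;
    {
        ToyGridView view(grid);
        grid.reset();  // models the garbage collector dropping the script-side handle
        ok = !watcher.expired() && view.has_key("NAME");
    }  // view destroyed here: the matching "unref" of the parent
    return ok && watcher.expired();
}
```

Tying the release to the view's destructor is one way to answer the open question above: the parent is unreferenced exactly when the view is cleaned up.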

springmeyer pushed a commit that referenced this issue Jul 12, 2012

Dane Springmeyer
reference count Image objects in use by ImageView objects to avoid possible scope issues resulting in segfaults when v8 garbage collects - closes #89, refs #110