Meteor segfaults randomly on reload (locally) #8648

Closed
damjuve opened this Issue Apr 28, 2017 · 100 comments

@damjuve

damjuve commented Apr 28, 2017

Hi,

When I use Meteor locally, the server starts correctly, but I face frequent segfaults that appear at random (I can't figure out exactly what causes them).
They seem to occur when the server reloads after a file modification. But it doesn't segfault on every reload, nor always on the same files (sometimes after a server-side modification, sometimes after a client-side one).

My RAM isn't full (about 10 GB free when the segfault occurs).
Here are all the packages I use:

`meteor-base@1.0.4 # Packages every Meteor app needs to have
mobile-experience@1.0.4 # Packages for a great mobile UX
mongo@1.1.16 # The database Meteor supports right now
blaze-html-templates@1.0.4 # Compile .html files into Meteor Blaze views
reactive-var@1.0.11 # Reactive variable for tracker
jquery@1.11.10 # Helpful client-side library
tracker@1.1.2 # Meteor's client-side reactive programming library

standard-minifier-css@1.3.4 # CSS minifier run for production mode
standard-minifier-js@2.0.0 # JS minifier run for production mode
es5-shim@4.6.15 # ECMAScript 5 compatibility for older browsers.
ecmascript@0.7.2 # Enable ECMAScript2015+ syntax in app code
shell-server@0.2.3 # Server-side component of the meteor shell command

accounts-ui@1.1.9
kadira:flow-router
zimme:active-route
underscore@1.0.10
kadira:blaze-layout
accounts-password@1.3.5
accounts-facebook@1.1.1
accounts-google@1.1.2
google-config-ui@1.0.0
facebook-config-ui@1.0.0
twbs:bootstrap
tanis:bootstrap-social
fortawesome:fontawesome
less@2.7.9
email@1.2.0
check@1.2.5
meteorhacks:ssr
arillo:flow-router-helpers
cfs:filesystem
ostrio:files
session@1.1.7
dburles:google-maps
sergeyt:typeahead
raix:handlebar-helpers
dburles:collection-helpers
manuel:reactivearray
aldeed:collection2-core
mizzao:bootboxjs
meteorhacks:aggregate
aldeed:autoform@6.0.0
summernote:summernote
anback:bootstrap-validator
aldeed:autoform-bs-datepicker
natestrauser:select2
rajit:bootstrap3-datepicker
natestrauser:font-awesome
mystor:device-detection
reactive-dict@1.1.8
aldeed:simple-schema
froala:editor
andrasph:clockpicker
aldeed:template-extension
tsega:bootstrap3-datetimepicker
`

I don't know what other information I can give you, but I remain available if you need anything.

Regards.

@hwillson

Member

hwillson commented Apr 28, 2017

Hi @damjuve - what is the exact error message you're seeing? Does #8157 look similar to your problem?

@humbertocruz

humbertocruz commented May 1, 2017

Same here after upgrading; the error is the same:

Client modified -- refreshing (x5)[1] 20587 segmentation fault meteor -s settings.json

The number changes: 20587 this time, but also many others, such as 19119 and 16496.

@mjmasn

Contributor

mjmasn commented May 2, 2017

@hwillson I think this is related to one or more of #8002 #6241 #4446

Seems a lot more people are reporting these lately so it may be time to take a proper look. The difficulty will probably be replicating the issue reliably without being able to access confidential code, I'm happy to help if I can though.

Edit: Just for info, I see abort trap 6 5+ times a day (across various Meteor apps), and segmentation fault 11 roughly once a week. Latest Meteor release (1.4.4.1) and macOS Sierra (10.12.4) on a 2014 MacBook Pro Retina 13 (2.8GHz i5, 16 GB RAM).

@Herteby

Herteby commented May 2, 2017

I'm having the same issue, and people are reporting it on the forums too.
Forum thread

@damjuve

damjuve commented May 2, 2017

@hwillson - There is no error, just a "segmentation fault (core dumped)" message from the system.
I don't know how to get more detailed logs.
I posted the problem on the Meteor forums (https://forums.meteor.com/t/frequent-segmentation-fault-on-local/36064/3 and https://forums.meteor.com/t/segmentation-fault-11-meteor-crashing-during-development/35234/11). It seems to have been happening frequently since the last Meteor update, and on several systems.
In my case, I work with two coworkers on the same app.
I face really frequent segfaults (5+ per day) on Ubuntu 16.04.
One of my colleagues faces them less frequently (5+ per week) on Mac 10.12.4.
The other has faced it only twice, on Debian 3.16.0-4.

I don't know if this is relevant, but I can add that I work a lot on the back end (server files), the Mac user works on both front and back end, and the Debian user mostly on the front end. However, I noticed that the problem happens when reloading either server or client files.

I remain available if I can provide more clues about this.

Thanks for your time.

@hwillson

Member

hwillson commented May 2, 2017

Thanks all - these comments really help. Please keep posting any additional details you uncover.

@hwillson hwillson removed the review label May 3, 2017

@hwillson

Member

hwillson commented May 3, 2017

Would it be possible for anyone getting a core dump here to share that core dump?

@rlora

rlora commented May 6, 2017

Hi @hwillson,

I'm experiencing the issue as well on macOS. About twice a day.
I got this using lldb attached to the process. Do you think the core dump would be useful? It seems the crash could be in Node.

Process 48213 resuming
Process 48213 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x3815fd003950)
    frame #0: 0x000000010035ab2e node`v8::internal::MarkCompactCollector::ProcessWeakCollections() + 302
node`v8::internal::MarkCompactCollector::ProcessWeakCollections:
->  0x10035ab2e <+302>: movl   0xa8(%r12,%rax), %eax
    0x10035ab36 <+310>: movzbl %cl, %ecx
    0x10035ab39 <+313>: btl    %ecx, %eax
    0x10035ab3c <+316>: jae    0x10035b020               ; <+1568>
@ffxsam

Contributor

ffxsam commented May 8, 2017

Just had seg fault 11 happen three times within 10 minutes. This is seriously harming productivity.

@abernix

Member

abernix commented May 8, 2017

It would be preferable for anyone "+1"-ing this to, at the very least, include the version of Meteor that they're using. While important when reporting any bug, it's particularly important on an issue of this nature as it could be tied to any number of underlying dependencies (Node.js, v8, etc.)

@rlora Your issue looks like a dangling pointer. If you're not doing manual memory management and relying on V8's automatic garbage collection, then it could certainly seem to be a V8 issue. As stated above, the Meteor version you're using would be helpful. Of course, there are no guarantees that any of the faults in this thread are related, but you're in an opportune position if you've captured it while attached. Can you get any more information from lldb with further inspection of the backtrace? bt might be a good debugger command to start with (and maybe frame info?).

@humbertocruz

humbertocruz commented May 8, 2017

Here it is occurring several times, on almost every code change and Meteor restart...
I don't know how to help debug it. Using version 1.4.4.2.

@ffxsam

Contributor

ffxsam commented May 8, 2017

1.4.4.2 here as well.

@rlora

rlora commented May 9, 2017

@abernix I'm sorry, I've spent the whole afternoon trying to replicate the crash, but everything works like a charm :(

I'm not doing manual memory management. I think the issue is in V8. I'm using Meteor 1.4.4.2

For people experiencing the crash on macOS, follow these steps:

  1. Start your app: meteor --settings settings.json
  2. In a new terminal window start lldb
  3. Inside lldb
(lldb) attach <PID>
(lldb) continue

Continue working in your app. If it crashes, lldb will stop and your app will become unresponsive, but lldb will prevent the app from exiting with a segmentation fault.

You can debug it from there. If you type bt you should get the backtrace.

I will continue trying to reproduce but maybe someone will capture it sooner.
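
The attach steps above can be partly scripted. The following is a minimal sketch and not part of the original comment: both function names are hypothetical, and it assumes the Meteor tool process can be recognized by `meteor-tool` and `tools/index.js` in its command line, which may not hold for every installation.

```shell
# Hypothetical helper: read `ps aux`-style lines on stdin and print the PID
# (second column) of every process that looks like the Meteor tool itself.
meteor_tool_pids() {
  awk '/meteor-tool/ && /tools\/index\.js/ { print $2 }'
}

# Hypothetical helper: print the two lldb commands from the steps above
# for a given PID, ready to be typed at the (lldb) prompt.
lldb_attach_commands() {
  printf 'attach %s\ncontinue\n' "$1"
}

# Usage (commented out so that sourcing this sketch has no side effects):
# ps aux | meteor_tool_pids
# lldb_attach_commands 12345
```

The app itself runs as a separate node process, so if the crash is in the app rather than in the tool, a second lldb session attached to that PID would be needed.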

@mjmasn

Contributor

mjmasn commented May 9, 2017

Helpful info @rlora, I'll run lldb today and see if I can come up with anything useful

@mjmasn

Contributor

mjmasn commented May 9, 2017

Who wants to decipher this?

Process 74267 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGPIPE
    frame #0: 0x00007fffcb2bb7e6 libsystem_kernel.dylib`write + 10
libsystem_kernel.dylib`write:
->  0x7fffcb2bb7e6 <+10>: jae    0x7fffcb2bb7f0            ; <+20>
    0x7fffcb2bb7e8 <+12>: movq   %rax, %rdi
    0x7fffcb2bb7eb <+15>: jmp    0x7fffcb2b2cd4            ; cerror
    0x7fffcb2bb7f0 <+20>: retq
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGPIPE
  * frame #0: 0x00007fffcb2bb7e6 libsystem_kernel.dylib`write + 10
    frame #1: 0x00000001007910f8 node`uv__write + 215
    frame #2: 0x0000000100790fe1 node`uv_write2 + 508
    frame #3: 0x000000010079151b node`uv_try_write + 110
    frame #4: 0x000000010068dcb3 node`node::StreamWrap::DoTryWrite(uv_buf_t**, unsigned long*) + 41
    frame #5: 0x000000010068ba3c node`int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&) + 1080
    frame #6: 0x000000010068e8fa node`void node::StreamBase::JSMethod<node::StreamWrap, &(int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&))>(v8::FunctionCallbackInfo<v8::Value> const&) + 72
    frame #7: 0x000000010017859f node`v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) + 159
    frame #8: 0x00000001001a0c04 node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::(anonymous namespace)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>&) + 1060
    frame #9: 0x00000001001a386d node`v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) + 61
    frame #10: 0x000036af2a10963b
    frame #11: 0x000036af2bb64e8a
    frame #12: 0x000036af2bb6489a
    frame #13: 0x000036af2bba591b
    frame #14: 0x000036af2c0eddaf
    frame #15: 0x000036af2a109ff7
    frame #16: 0x000036af2bb33b98
    frame #17: 0x000036af2a109ff7
    frame #18: 0x000036af2c5f53ef
    frame #19: 0x000036af2a109ff7
    frame #20: 0x000036af2a1318f9
    frame #21: 0x000036af2a115b62
    frame #22: 0x00000001002d34c8 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
    frame #23: 0x000000010052440c node`v8::internal::Runtime_Apply(int, v8::internal::Object**, v8::internal::Isolate*) + 908
    frame #24: 0x000036af2a10963b
    frame #25: 0x000036af2bbd3bbf
    frame #26: 0x000036af2a109ff7
    frame #27: 0x000036af2c1a582f
    frame #28: 0x000036af2bba3706
    frame #29: 0x000036af2be4bac1
    frame #30: 0x000036af2a1318fd
    frame #31: 0x000036af2a115b62
    frame #32: 0x00000001002d34c8 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
    frame #33: 0x000000010015f4c4 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 276
    frame #34: 0x000000010064f5d7 node`node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) + 609
    frame #35: 0x000000010069137f node`node::TimerWrap::OnTimeout(uv_timer_s*) + 127
    frame #36: 0x0000000100792bce node`uv__run_timers + 38
    frame #37: 0x00000001007881e2 node`uv_run + 580
    frame #38: 0x0000000100661aa1 node`node::Start(int, char**) + 735
    frame #39: 0x0000000100001834 node`start + 52
@mjmasn

Contributor

mjmasn commented May 9, 2017

The previous comment was a seg fault 11 on one app; I think this one was an abort trap 6 on a different app:

It looks like the meteor tool crashes first, followed by the actual app...

Output of ps aux | grep node | grep ZZZ:
(Replaced ports, paths and app name with XXXX / YYY / ZZZ)

mike             73815  56.4  4.6  3918388 777380 s001  SX    9:40am  37:10.75 /Users/mike/.meteor/packages/meteor-tool/.1.4.4_2.r4ho8o++os.osx.x86_64+web.browser+web.cordova/mt-os.osx.x86_64/dev_bundle/bin/node /Users/mike/.meteor/packages/meteor-tool/.1.4.4_2.r4ho8o++os.osx.x86_64+web.browser+web.cordova/mt-os.osx.x86_64/tools/index.js --port XXXX --settings ../settings.json
mike             81476   1.5  0.7  3199824 113484 s001  S    11:32am   0:10.58 /Users/mike/.meteor/packages/meteor-tool/.1.4.4_2.r4ho8o++os.osx.x86_64+web.browser+web.cordova/mt-os.osx.x86_64/dev_bundle/bin/node /Users/mike/YYY/ZZZ/.meteor/local/build/main.js

meteor-tool:

Process 73815 stopped
* thread #11, stop reason = signal SIGABRT
    frame #0: 0x00007fffcb2b9d42 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill:
->  0x7fffcb2b9d42 <+10>: jae    0x7fffcb2b9d4c            ; <+20>
    0x7fffcb2b9d44 <+12>: movq   %rax, %rdi
    0x7fffcb2b9d47 <+15>: jmp    0x7fffcb2b2caf            ; cerror_nocancel
    0x7fffcb2b9d4c <+20>: retq
(lldb) bt
* thread #11, stop reason = signal SIGABRT
  * frame #0: 0x00007fffcb2b9d42 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fffcb3a75bf libsystem_pthread.dylib`pthread_kill + 90
    frame #2: 0x00007fffcb21f420 libsystem_c.dylib`abort + 129
    frame #3: 0x00000001007926a2 node`uv_cond_wait + 20
    frame #4: 0x000000010078617b node`worker + 227
    frame #5: 0x0000000100792300 node`uv__thread_start + 25
    frame #6: 0x00007fffcb3a49af libsystem_pthread.dylib`_pthread_body + 180
    frame #7: 0x00007fffcb3a48fb libsystem_pthread.dylib`_pthread_start + 286
    frame #8: 0x00007fffcb3a4101 libsystem_pthread.dylib`thread_start + 13
(lldb) continue
Process 73815 resuming
Process 73815 exited with status = 0 (0x00000000) Terminated due to signal 6

the app:

Process 81476 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGPIPE
    frame #0: 0x00007fffcb2bb7e6 libsystem_kernel.dylib`write + 10
libsystem_kernel.dylib`write:
->  0x7fffcb2bb7e6 <+10>: jae    0x7fffcb2bb7f0            ; <+20>
    0x7fffcb2bb7e8 <+12>: movq   %rax, %rdi
    0x7fffcb2bb7eb <+15>: jmp    0x7fffcb2b2cd4            ; cerror
    0x7fffcb2bb7f0 <+20>: retq
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGPIPE
  * frame #0: 0x00007fffcb2bb7e6 libsystem_kernel.dylib`write + 10
    frame #1: 0x00000001007910f8 node`uv__write + 215
    frame #2: 0x0000000100790fe1 node`uv_write2 + 508
    frame #3: 0x000000010079151b node`uv_try_write + 110
    frame #4: 0x000000010068dcb3 node`node::StreamWrap::DoTryWrite(uv_buf_t**, unsigned long*) + 41
    frame #5: 0x000000010068ba3c node`int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&) + 1080
    frame #6: 0x000000010068e8fa node`void node::StreamBase::JSMethod<node::StreamWrap, &(int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&))>(v8::FunctionCallbackInfo<v8::Value> const&) + 72
    frame #7: 0x000000010017859f node`v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) + 159
    frame #8: 0x00000001001a0c04 node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::(anonymous namespace)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>&) + 1060
    frame #9: 0x00000001001a386d node`v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) + 61
    frame #10: 0x000014e493e0963b
    frame #11: 0x000014e495dea48a
    frame #12: 0x000014e495de9e9a
    frame #13: 0x000014e495de9665
    frame #14: 0x000014e495d48dda
    frame #15: 0x000014e496eb363c
    frame #16: 0x000014e496e79da1
    frame #17: 0x000014e496e79aa2
    frame #18: 0x000014e493e09ff7
    frame #19: 0x000014e496e4f792
    frame #20: 0x000014e496e78eb2
    frame #21: 0x000014e493e09ff7
    frame #22: 0x000014e496e92a47
    frame #23: 0x000014e493e09ff7
    frame #24: 0x000014e496909f62
    frame #25: 0x000014e496909ca8
    frame #26: 0x000014e493e345e7
    frame #27: 0x000014e496eda72f
    frame #28: 0x000014e493e09ff7
    frame #29: 0x000014e495d780b4
    frame #30: 0x000014e493e09ff7
    frame #31: 0x000014e4974810ca
    frame #32: 0x000014e495e8e705
    frame #33: 0x000014e496ee1de8
    frame #34: 0x000014e496ee19e5
    frame #35: 0x000014e496ee1678
    frame #36: 0x000014e496eda860
    frame #37: 0x000014e493e09ff7
    frame #38: 0x000014e495d780b4
    frame #39: 0x000014e493e09ff7
    frame #40: 0x000014e4974810ca
    frame #41: 0x000014e495e8e705
    frame #42: 0x000014e496ee1de8
    frame #43: 0x000014e496ee19e5
    frame #44: 0x000014e496ee1678
    frame #45: 0x000014e496eda860
    frame #46: 0x000014e493e09ff7
    frame #47: 0x000014e495d780b4
    frame #48: 0x000014e493e09ff7
    frame #49: 0x000014e4974810ca
    frame #50: 0x000014e495e8e705
    frame #51: 0x000014e496ee1de8
    frame #52: 0x000014e496ee19e5
    frame #53: 0x000014e496ee1678
    frame #54: 0x000014e496eda860
    frame #55: 0x000014e493e09ff7
    frame #56: 0x000014e495d780b4
    frame #57: 0x000014e493e09ff7
    frame #58: 0x000014e4974810ca
    frame #59: 0x000014e495e8e705
    frame #60: 0x000014e496ee1de8
    frame #61: 0x000014e496ee19e5
    frame #62: 0x000014e496ee1678
    frame #63: 0x000014e496eda860
    frame #64: 0x000014e496edb777
    frame #65: 0x000014e493e09ff7
    frame #66: 0x000014e495da7ee7
    frame #67: 0x000014e4974dfcc1
    frame #68: 0x000014e495d635c9
    frame #69: 0x000014e495dc60e3
    frame #70: 0x000014e495d385e7
    frame #71: 0x000014e493e318fd
    frame #72: 0x000014e493e15b62
    frame #73: 0x00000001002d34c8 node`v8::internal::Invoke(bool, v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 728
    frame #74: 0x000000010015f4c4 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 276
    frame #75: 0x000000010064f6f3 node`node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) + 893
    frame #76: 0x000000010068ae90 node`node::StreamBase::EmitData(long, v8::Local<v8::Object>, v8::Local<v8::Object>) + 224
    frame #77: 0x00000001006b198e node`node::TLSWrap::ClearOut() + 228
    frame #78: 0x00000001006b26c8 node`node::TLSWrap::DoRead(long, uv_buf_t const*, uv_handle_type) + 112
    frame #79: 0x000000010068db87 node`node::StreamWrap::OnReadCommon(uv_stream_s*, long, uv_buf_t const*, uv_handle_type) + 127
    frame #80: 0x000000010078f87e node`uv__stream_io + 1235
    frame #81: 0x0000000100797004 node`uv__io_poll + 1621
    frame #82: 0x00000001007880df node`uv_run + 321
    frame #83: 0x0000000100661aa1 node`node::Start(int, char**) + 735
    frame #84: 0x0000000100001834 node`start + 52
@abernix

Member

abernix commented May 9, 2017

@mjmasn Thanks for providing that. That does look like it's stemming from the meteor tool itself, and your own app is simply being terminated (indicated as a SIGPIPE term, since it's a child process) as a by-product of that SIGABRT. This project is upgraded to Meteor 1.4.4.2, correct? (assuming based on #8630). The libuv involvement certainly makes me lean toward file watching issues, something you've brought up recently in #8002 (comment).
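
For anyone unsure which signal ended a process: the numbers in messages like "abort trap 6" and "segmentation fault 11" are signal numbers (6 is SIGABRT, 11 is SIGSEGV, and SIGPIPE is 13). The sketch below is an illustration only, with a hypothetical function name, and assumes a shell such as bash or sh that reports a signal death as exit status 128 plus the signal number.

```shell
# Hypothetical helper: map a shell exit status to the name of the signal
# that terminated the process, or print "none" for an ordinary exit.
signal_from_exit_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    kill -l "$((status - 128))"   # e.g. 6 -> ABRT, 11 -> SEGV, 13 -> PIPE
  else
    echo "none"
  fi
}

# Usage: run the command, then inspect "$?" right after it exits.
# meteor; signal_from_exit_status "$?"
```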

@mjmasn

Contributor

mjmasn commented May 9, 2017

@abernix: yep, yep and yep ;)

Another possibly related issue (relating to gulp so could be another vote for this being a file watching issue): nodejs/node#10163. This could definitely be a bug in node / libuv...

Is there anything I can do to dig deeper into this? I'm out of the country for a week without my laptop after today but happy to do anything that might help at the back end of next week :)

@abernix

Member

abernix commented May 9, 2017

@mjmasn Node 6 still uses the same version of libuv otherwise I'd suggest taking a very experimental walk through #6923. This should be in NO way suggested as a work-around for anyone experiencing this problem as Meteor 1.6 is currently extremely experimental and unsuitable for development or production. You could consider applying the patch in the issue you linked to, though you'd need to do some even more experimental building of the dev bundle with the patched version, by making modifications to generate_dev_bundle.sh. It's a bit of the wild-west if you choose to embark on that adventure though. :)

@humbertocruz

humbertocruz commented May 9, 2017

I got this with lldb:

Process 65164 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fffcaeddd96 libsystem_kernel.dylib`kevent + 10
libsystem_kernel.dylib`kevent:
->  0x7fffcaeddd96 <+10>: jae    0x7fffcaeddda0            ; <+20>
    0x7fffcaeddd98 <+12>: movq   %rax, %rdi
    0x7fffcaeddd9b <+15>: jmp    0x7fffcaed5caf            ; cerror_nocancel
    0x7fffcaeddda0 <+20>: retq
(lldb) continue
Process 65164 resuming
Process 65164 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGUSR2
    frame #0: 0x00007fffcaedd1ce libsystem_kernel.dylib`__sigsuspend + 10
libsystem_kernel.dylib`__sigsuspend:
->  0x7fffcaedd1ce <+10>: jae    0x7fffcaedd1d8            ; <+20>
    0x7fffcaedd1d0 <+12>: movq   %rax, %rdi
    0x7fffcaedd1d3 <+15>: jmp    0x7fffcaed5cd4            ; cerror
    0x7fffcaedd1d8 <+20>: retq
(lldb) continue
Process 65164 resuming
Process 65164 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGUSR2
    frame #0: 0x00007fffcaedd1ce libsystem_kernel.dylib`__sigsuspend + 10
libsystem_kernel.dylib`__sigsuspend:
->  0x7fffcaedd1ce <+10>: jae    0x7fffcaedd1d8            ; <+20>
    0x7fffcaedd1d0 <+12>: movq   %rax, %rdi
    0x7fffcaedd1d3 <+15>: jmp    0x7fffcaed5cd4            ; cerror
    0x7fffcaedd1d8 <+20>: retq
(lldb) continue
Process 65164 resuming
@rlora

rlora commented May 10, 2017

@abernix I was able to catch a couple of segmentation faults but the signal I'm getting is different: EXC_BAD_ACCESS, apparently related to garbage collection. I got the signal a couple of times followed by a segmentation fault.

This is the frame: v8::internal::MarkCompactCollector::ProcessWeakCollections() + 302

@humbertocruz

humbertocruz commented May 12, 2017

I just noticed that I got the segmentation fault right now when I tried to load a template from another one with a typo in its name: I wrote "+loading" but it should have been "+loadingView".

I'm using Jade instead of HTML.

@arggh

Contributor

arggh commented May 17, 2017

1.4.2.3 and getting Abort trap 6 several times a day.

@arggh

Contributor

arggh commented Aug 23, 2017

I tried the newest beta release 1.5.2-beta.13 and for a while I thought this got fixed, but...

=> Client modified -- refreshing (x16)
I20170823-13:29:14.856(3)? Serving logger on /logger
I20170823-13:29:14.871(3)? Processing jobs disabled.
=> Meteor server restarted
I20170823-13:30:39.247(3)? Serving logger on /logger
I20170823-13:30:39.263(3)? Processing jobs disabled.
=> Meteor server restarted
=> Client modified -- refreshing (x51)Segmentation fault: 11
arggh@ 🚀  :~/Development/app$ meteor --version
Meteor 1.5.2-beta.13                          
arggh@ 🚀  :~/Development/app$ meteor node --version
v4.8.4
arggh@ 🚀  :~/Development/app$
@abernix

Member

abernix commented Aug 23, 2017

@arggh That's unfortunate, though I know it has certainly fixed other segmentation faults, so your particular case must be different. If you run ulimit -c unlimited on your (Mac? *nix?) machine and continue developing until it happens again, it should write a core dump (the default configuration usually doesn't, as ulimit -c is set to 0). I won't advocate publicly posting a core dump unless you're confident it doesn't contain sensitive or proprietary information, but if you'd be willing to upload and share it with me privately, I will try to take a look at it. I can be messaged on the Meteor forums with the same username. It'd be important to also let me know the exact version of Meteor which produced the core dump.
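
A minimal sketch of checking the core dump limit before starting Meteor follows. It is an illustration, not part of the original comment: the function name is hypothetical, and where the core file ends up is system-dependent (typically /cores on macOS, while on Linux it depends on the kernel's core_pattern setting).

```shell
# Hypothetical helper: interpret a value printed by `ulimit -c`.
core_dump_status() {
  case "$1" in
    0) echo "disabled" ;;
    *) echo "enabled" ;;   # "unlimited" or a positive size limit
  esac
}

# Usage, in the same terminal session that will run Meteor:
# ulimit -c unlimited              # raises the limit for this shell only
# core_dump_status "$(ulimit -c)"  # should now report "enabled"
# meteor                           # plus your usual arguments
```

The limit is per shell session, so it has to be set again in every new terminal window.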

I'd also suggest trying Meteor 1.6 beta releases if that's an option for you!

@TheRealNate

TheRealNate commented Aug 29, 2017

Testing on 1.5.2-rc.2, will edit with results.

EDIT: So far so good, we had really frequent segmentation faults and haven't had any yet

@abernix

Member

abernix commented Sep 11, 2017

I would say that this issue is lessened with Meteor 1.5.2, but that we're not out of the woods. Specifically, I believe the

v8::internal::MarkCompactCollector::ProcessWeakCollections()

crash should be fixed thanks to #9031, however, there does seem to be another issue (also in similar V8 garbage-collection code) which (based on information I've analyzed privately with @arggh) appears to be the same as nodejs/node#3715 and https://bugs.chromium.org/p/chromium/issues/detail?id=408380 which have both been dismissed upstream due to lack of reproduction.

Anyone continuing to experience crashes with Meteor 1.5.2, please do write back and provide any additional information you might have about your segmentation fault. If you're on macOS, consider posting (as a Gist and linking here) the file generated in ~/Library/Logs/DiagnosticReports/node_<TIMESTAMP>_<HOSTNAME>.crash after the segmentation fault occurs.
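
To locate the newest crash report quickly, something like the following could be used. This is a sketch under assumptions: the function name is hypothetical, and it relies only on the directory and the node_ prefix and .crash suffix given above.

```shell
# Hypothetical helper: print the most recently modified node_*.crash file
# in the given directory (defaults to the macOS location mentioned above).
latest_node_crash_report() {
  dir=${1:-"$HOME/Library/Logs/DiagnosticReports"}
  # ls -t sorts newest first; errors are silenced when nothing matches.
  ls -t "$dir"/node_*.crash 2>/dev/null | head -n 1
}

# Usage:
# latest_node_crash_report
```

Review the file for private paths or names before posting it as a Gist.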

I'll leave this particular issue open until @arggh or I are able to write up a new issue for the v8::internal::PointersUpdatingVisitor::VisitPointer failure. And truly, thank you @arggh for working diligently and thoroughly to help me diagnose his segmentation faults. 👏

@danwild

danwild commented Sep 15, 2017

Hmm, I'm seeing this a few times a day now too (OSX):

=> App running at: http://localhost:3000/
=> Client modified -- refreshing (x4)
=> Meteor server restarted                    
=> Client modified -- refreshingSegmentation fault: 11
$  meteor --version
Meteor 1.5.1  
$ meteor node --version
v4.8.4

Updating to 1.5.2 now, will report back if issue persists.

@abernix

Member

abernix commented Sep 15, 2017

@danwild and others: If you're experiencing this problem on a regular basis, as a matter of experimentation, would you please try setting the TOOL_NODE_FLAGS environment variable to --no-expose-gc and see if the problem goes away?

On macOS this can be done with:

$ TOOL_NODE_FLAGS="--no-expose-gc" meteor ... # usual arguments here.

Reason being: the comments I listed here. Meteor does use the --expose-gc flag in order to afford ourselves better management of memory within the development process; however, it may be causing the problems you're experiencing, presumably because of an issue in V8 (as those referenced issues allude to).

By passing --no-expose-gc, you'll override our enabling of this (which we can detect) but potentially increase your memory usage (though, hopefully not to the point where you'll be exceeding any memory allocations).
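
If you already set TOOL_NODE_FLAGS for other reasons, a small wrapper can add the flag without discarding the existing value. This is a sketch, not an official Meteor helper: the function name is hypothetical, and it assumes the variable accepts several space-separated flags.

```shell
# Hypothetical wrapper: run any command with --no-expose-gc appended to
# TOOL_NODE_FLAGS, keeping flags that were already set in the environment.
with_no_expose_gc() {
  env TOOL_NODE_FLAGS="${TOOL_NODE_FLAGS:+$TOOL_NODE_FLAGS }--no-expose-gc" "$@"
}

# Usage:
# with_no_expose_gc meteor --settings settings.json
```

The variable is changed only for the wrapped command, so other terminals and later commands are unaffected.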

/cc @arggh this should be an easier way of accomplishing the same change I had you make to your meteor command in your dev_bundle. 😉

@abernix

Member

abernix commented Sep 15, 2017

Oh, and please do report back with your findings, @danwild!

@danwild

danwild commented Sep 19, 2017

Well, two days of dev since upgrading to Meteor 1.5.2 and I haven't seen any segfaults.
The update seems to have fixed the issue for me - thanks all!

@abernix

Member

abernix commented Sep 19, 2017

@danwild Is that with the TOOL_NODE_FLAGS="--no-expose-gc" environment variable, or just by upgrading to Meteor 1.5.2 alone?

@danwild

danwild commented Sep 19, 2017

Just the meteor 1.5.2 update @abernix, hadn't seen the error again so didn't seem to need --no-expose-gc

@derwaldgeist

derwaldgeist commented Sep 21, 2017

+1 on Meteor 1.5.0. Today, I got the SIGABRT for the first time without any reload; it occurred after the server received data via the websockets connection (method call).

@sirpy

sirpy commented Oct 22, 2017

I'm getting segmentation faults in production; it looks like they come from code that uses Set and Map.
When I started running the production code with Meteor 1.5.2.2 using the command line "meteor node main.js", the crashes stopped (before that, I was using mup with the abernix-base meteord image).
Is the Node bug supposed to happen only when running Meteor in development?

@MichaelJCole

MichaelJCole commented Oct 22, 2017

@sirpy, I don't have the answer, but I'm interested in it. Could you show some example code: Set/Map from what?

@sirpy

sirpy commented Oct 22, 2017

@sirpy

sirpy commented Oct 29, 2017

I've tried running the server with Meteor's patched Node 4.8.4 (meteor node main.js), and the segmentation fault crashes stopped.

@abernix abernix referenced this issue Nov 7, 2017

Merged

Release 1.5.4 #9320

@abernix

Member

abernix commented Nov 8, 2017

@sirpy Glad to hear that! For what it's worth, Node 4.8.6 now officially contains the Meteor-supplied fix which we patched into Meteor's Node 4.8.4. See #9320 for more information.

@arggh

Contributor

arggh commented Apr 3, 2018

Reporting just in case: Meteor 1.6.1 and still getting Abort trap: 6 randomly ~once or twice a day. Now Meteor is also crashing without an error message sometimes, as if CTRL+C was pressed. I'm observing this on all three apps I'm currently working on: one of them huge & two tiny.

@abernix

Member

abernix commented Apr 19, 2018

@arggh Was this with a deployed version of a Meteor 1.6.1 app? (That is to say, a built app being run with node directly?) If so, which Node.js version?

@arggh

Contributor

arggh commented Apr 19, 2018

@abernix Nope, they happen during local development. I haven't yet deployed with Meteor 1.6.x to any actual live app, should I expect/fear these crashes to appear in the deployed version with 1.6.1?

@s7dhansh

Contributor

s7dhansh commented Apr 22, 2018

I started facing this today; it happens roughly every 10 minutes. I did not do a Meteor update today, just an Ubuntu update (I am on Bionic beta), so it is highly likely that the cause is OS-specific, especially since I am also getting Chrome segfaults.

Still, since I am at a loss to figure out a fix, any help would be much appreciated.
Meteor: v1.6.1.1
Meteor node: v8.11.1

--no-expose-gc does not help

segfault at d ip 00007f721ac9646c sp 00007fff2d860c50 error 4 in libnode.so[7f721a116000+137e000]
trap invalid opcode ip:7fd3aa039d88 sp:7fd3a4af3288 error:0 in libc-2.27.so[7fd3a9eaf000+1e7000]
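
The libnode.so line above gives the faulting instruction pointer (ip) and, in brackets, the library's load address and size. Subtracting the load address from the ip gives the fault's offset inside the library, which a symbolizer such as addr2line could resolve if a matching build with symbols is available. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper: given the faulting instruction pointer and the load
# address of the mapped library (both hex, without a 0x prefix), print the
# offset of the fault inside that library, also in hex.
fault_offset() {
  printf '%x\n' "$(( 0x$1 - 0x$2 ))"
}

# Values taken from the libnode.so line above:
# fault_offset 00007f721ac9646c 7f721a116000
```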
@raza2022

raza2022 commented Jul 21, 2018

I also find that it's OS-specific. Could it also be directly related to the Session package?
