
Ember Try scenario failing on Travis with a Segmentation fault #360

Closed · samselikoff opened this issue May 31, 2019 · 22 comments

@samselikoff

Hi! I'm investigating a Travis failure in one of my Ember Try scenarios and am looking for any guidance. I have no idea if this has anything to do with Ember Try but figured I'd start here.

Here's the failure. If you look back at the full build you'll see the same failure for all Versioned tests.

When I tried one of them locally by running

ember try:one ember-lts-2.18

it passed with no problem.

My next guess was that the segmentation fault had something to do with Travis' cache. I thought maybe it was due to all the PRs Dependabot was opening. I went back to Travis, deleted all caches, and re-ran master. No change; the fault still happened.

I then thought it might have been due to a code change on my end, so I went back to the last-passing build and re-ran it. I saw the same failures on the Versioned tests.


Any idea what could be going on? Is there possibly a memory leak that I'm not seeing locally but that is causing Travis to blow up?

Any help much appreciated!

@kategengler
Member

Can you try running with DEBUG=ember-try* {{scenario-command}} on Travis, so we can see where the segfault happens? It's at the end, after the results are printed; the only thing ember-try does at that point is cleanup.
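
For example, locally or via the Travis config on a branch (using the scenario mentioned earlier in the thread; the exact scenario name will vary per project):

DEBUG=ember-try* ember try:one ember-lts-2.18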

@rwjblue
Member

rwjblue commented May 31, 2019

FYI - @scalvert has been digging into this same issue (over in the ember-app-scheduler repo). Not yet sure what is going on, and we can't get it to repro locally yet.

@scalvert

Correct. I've got the same issue on both ember-app-scheduler and ember-lifeline. Specifically, ember-lifeline's HEAD of master was passing CI; after seeing the issues in ember-app-scheduler, and to test whether the problem was isolated to that repo, I restarted ember-lifeline's build job in Travis to see if it segfaulted too. It did.

Steps I've taken to try to isolate it (ember-app-scheduler/ember-app-scheduler#312):

  1. Downgraded node to 8.15 (ember-app-scheduler/ember-app-scheduler@1e0c5fb)
  2. Upgraded node to 10 (ember-app-scheduler/ember-app-scheduler@b86074f)
  3. Downgraded eslint-plugin-prettier (ember-app-scheduler/ember-app-scheduler@4614811)
  4. Downgraded ember-cli (ember-app-scheduler/ember-app-scheduler@5cdcd1b)
  5. Upgraded ember-try to latest (ember-app-scheduler/ember-app-scheduler@9be9968)
  6. Acquired debug access to ember-app-scheduler, triggered debug builds, SSHed into the box, and was able to reproduce the segfault. I've been working with Travis support to see if I can get access to the core dumps.

As mentioned, I've been engaged with Travis support to try to investigate.

@scalvert

> Can you try running with DEBUG=ember-try* {{scenario-command}} on Travis, so we can see where the segfault happens? It's at the end, after the results are printed; the only thing ember-try does at that point is cleanup.

I can try this when I'm SSHed into the box.

@kategengler
Member

It's also possible to try that by updating the Travis config on a branch.

Worth noting: the latest passing ember-cli-mirage build was on the latest ember-try.

@scalvert

Yep, upgrading ember-try to latest had no effect on the occurrence of segfaults.

@kategengler
Member

Any update on this?

@scalvert

Getting closer. A pesky weekend got in the way of further debugging efforts. I plan to focus on this today.

@scalvert

Here's the top of the stack from the core dump from ember-lifeline:

(llnode) v8 bt
 * thread #1: tid = 0, 0x00007ff13af81554 sharp.node`std::queue<std::string, std::deque<std::string, std::allocator<std::string> > >::~queue() + 532, name = 'ember', stop reason = signal SIGSEGV
    frame #0: 0x00007ff13af81554 sharp.node`std::queue<std::string, std::deque<std::string, std::allocator<std::string> > >::~queue() + 532
    frame #1: 0x00007ff13d1ea1a9 libc.so.6`??? + 217
    frame #2: 0x00007ff13d1ea1f5 libc.so.6`exit + 21
    frame #3: 0x00000000008ce31f node`node::Exit(v8::FunctionCallbackInfo<v8::Value> const&) + 111
    frame #4: 0x0000000000a98153 node`v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) + 403
    frame #5: 0x0000000000b0f37c node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) + 332
    frame #6: 0x0000000000b0ffcf node`v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) + 175
    frame #7: 0x0000311f78f042fd <exit>
  * frame #8: 0x0000311f78fbedb6 process.exit(this=0x22321889cf9:<Object: process>, <Smi: 0>) at (external).js:140:26 fn=0x0000278746d26539
    frame #9: 0x0000311f78fbedb6 process.exit(this=0x22321889cf9:<Object: process>, <Smi: 0>) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/capture-exit/index.js:63:26 fn=0x0000021b2c7f18c9
    frame #10: 0x0000311f78fbedb6 tryToExit(this=0x39d479b822d1:<undefined>) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/exit/lib/exit.js:15:21 fn=0x00001949f5a51861
    frame #11: 0x0000311f78fbedb6 exit(this=0x39d479b822d1:<undefined>, <Smi: 0>, 0x39d479b822d1:<undefined>) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/exit/lib/exit.js:11:31 fn=0x0000359cd89084e9
    frame #12: 0x0000311f78f0535f <adaptor>
    frame #13: 0x0000311f78fbedb6 (anonymous)(this=0x39d479b822d1:<undefined>, <Smi: 0>) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/ember-cli/bin/ember:39:17 fn=0x0000391c89f1f009
    frame #14: 0x0000311f78fbedb6 tryCatcher(this=0x39d479b822d1:<undefined>) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/rsvp/dist/rsvp.js:322:22 fn=0x0000021b2c788481
    frame #15: 0x0000311f78f0535f <adaptor>
    frame #16: 0x0000311f78fbedb6 invokeCallback(this=0x39d479b822d1:<undefined>, <Smi: 1>, 0x391c89f1efc1:<Object: Promise>, 0x391c89f1f009:<function: (anonymous) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/ember-cli/bin/ember:39:17>, <Smi: 0>) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/rsvp/dist/rsvp.js:493:26 fn=0x0000021b2c788799
    frame #17: 0x0000311f78fbedb6 publish(this=0x39d479b822d1:<undefined>, 0x391c89f1eda9:<Object: Promise>) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/rsvp/dist/rsvp.js:463:19 fn=0x0000021b2c788751
    frame #18: 0x0000311f78fbedb6 flush(this=0x39d479b822d1:<undefined>) at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/rsvp/dist/rsvp.js:2436:17 fn=0x0000021b2c788d01
    frame #19: 0x0000311f78fbedb6 _combinedTickCallback(this=0x39d479b822d1:<undefined>, 0x39d479b822d1:<undefined>, 0x21b2c788d01:<function: flush at /home/travis/build/ember-lifeline/ember-lifeline/node_modules/rsvp/dist/rsvp.js:2436:17>) at (external).js:130:33 fn=0x00002d1858398529
    frame #20: 0x0000311f78fbedb6 _tickCallback(this=0x22321889cf9:<Object: process>) at (external).js:152:25 fn=0x000002232188c821
    frame #21: 0x0000311f78f04239 <internal>
    frame #22: 0x0000311f78f04101 <entry>
    frame #23: 0x0000000000da7d6a node`v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 266
    frame #24: 0x0000000000a7a793 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 355
    frame #25: 0x0000000000a89211 node`v8::Function::Call(v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 65
    frame #26: 0x00000000008ce228 node`node::InternalCallbackScope::Close() + 456
    frame #27: 0x00000000008cfa26 node`node::InternalMakeCallback(node::Environment*, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) + 198
    frame #28: 0x00000000008a5b08 node`node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) + 120
    frame #29: 0x00000000008fcec9 node`node::(anonymous namespace)::After(uv_fs_s*) + 329
    frame #30: 0x00000000009b53f5 node`uv__work_done(handle=0x0000000002186b50) + 165 at threadpool.c:313
    frame #31: 0x00000000009b989b node`uv__async_io(loop=0x0000000002186aa0, w=<unavailable>, events=<unavailable>) + 267 at async.c:118
    frame #32: 0x00000000009ca5c0 node`uv__io_poll(loop=0x0000000002186aa0, timeout=6578) + 752 at linux-core.c:375
    frame #33: 0x00000000009ba265 node`uv_run(loop=0x0000000002186aa0, mode=UV_RUN_DEFAULT) + 405 at core.c:370
    frame #34: 0x00000000008d6815 node`node::Start(uv_loop_s*, int, char const* const*, int, char const* const*) + 1205
    frame #35: 0x00000000008d5b70 node`node::Start(int, char**) + 352
    frame #36: 0x00007ff13d1cff45 libc.so.6`__libc_start_main + 245
    frame #37: 0x000000000089f301 node`_start + 41

The top of the stack is capture-exit, which captures and ultimately calls process.exit. There are also a number of RSVP calls directly before it. I'm going to inspect each frame to see if there's any more info.
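
For reference, a core dump like this can be loaded with llnode roughly as follows (a sketch, assuming the dump was saved as core and the matching node binary is available):

llnode node -c core
(llnode) v8 bt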

@kategengler
Member

Did you ever try running with DEBUG=ember-try*?

@scalvert

Yes, it didn't provide any useful information, unfortunately.

@kategengler
Member

I was mostly wondering: what was the last thing that happened before the segfault?

@scalvert

Well, the tests complete, and the process seems to 'hang' afterward.

I stood up an Ubuntu image in Azure to attempt to replicate it there, mainly because Travis' debug sessions have a timeout configured, which means the session will spontaneously end mid-debugging.

I was unable to reproduce the segfault on that server, though the process does hang for a significant amount of time after the tests complete successfully.

@kategengler
Member

I was wondering because, after all the tests run, there is a step where ember-try cleans up and reinstalls node_modules; it's possible the segfault is happening during that step.

@scalvert

Ah gotcha. @rwjblue @krisselden and I are chatting about it right now to see if we can determine the issue. It's now happening in @ember/test-helpers too :/

@scalvert

I tried running with DEBUG=ember-try*, and the cleanup phase appears to complete without issue. I can now reproduce the segfault on my Ubuntu machine in Azure. I have a core dump and am inspecting it again.

@scalvert

scalvert commented Jun 11, 2019

We've identified the issue. It stems from the ember-cli-favicon addon, which has a transitive dependency on the sharp node package, a library used for image processing.

azureuser@travis:~/ember-lifeline/ember-lifeline$ yarn why sharp
yarn why v1.16.0
[1/4] Why do we have the module "sharp"...?
[2/4] Initialising dependency graph...
[3/4] Finding dependency...
[4/4] Calculating file sizes...
=> Found "sharp@0.22.1"
info Reasons this module exists
   - "ember-cli-favicon#broccoli-favicon#favicons" depends on it
   - Hoisted from "ember-cli-favicon#broccoli-favicon#favicons#sharp"
info Disk size without dependencies: "31.6MB"
info Disk size with unique dependencies: "32.67MB"
info Disk size with transitive dependencies: "35.05MB"
info Number of shared dependencies: 46
Done in 2.34s.

It's the sharp library itself that is causing the segfault, as can be seen from the stack in llnode:

(llnode) v8 bt
 * thread #1: tid = 0, 0x00007ff13af81554 sharp.node`std::queue<std::string, std::deque<std::string, std::allocator<std::string> > >::~queue() + 532, name = 'ember', stop reason = signal SIGSEGV
    frame #0: 0x00007ff13af81554 sharp.node`std::queue<std::string, std::deque<std::string, std::allocator<std::string> > >::~queue() + 532
    frame #1: 0x00007ff13d1ea1a9 libc.so.6`??? + 217
    frame #2: 0x00007ff13d1ea1f5 libc.so.6`exit + 21
    frame #3: 0x00000000008ce31f node`node::Exit(v8::FunctionCallbackInfo<v8::Value> const&) + 111

In the favicons library, sharp was added in this commit: itgalaxy/favicons@928524c#diff-b9cfc7f2cdf78a7f4b91a753d10865a2. Since ember-try installs with --no-lockfile, we get upgraded to the version of broccoli-favicon that pulls in the version of favicons that includes sharp.

Workaround to unblock:

  • Use resolutions to pin favicons to v5.3.0:
"resolutions": {
  "favicons": "5.3.0"
}
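
Since ember-try installs with --no-lockfile, pinning via yarn.lock wouldn't help; the resolutions entry constrains resolution regardless of the lockfile. As a quick sanity check after adding it (a sketch; output will vary per project):

yarn install --no-lockfile
yarn why favicons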

We're trying to figure out the best place to report this issue.

@kategengler
Member

Wow! So many levels...

(I am continually amazed anything ever works)

samselikoff added a commit to miragejs/ember-cli-mirage that referenced this issue Jun 11, 2019
@stefanpenner
Contributor

stefanpenner commented Jun 12, 2019

My guess is that it's related to this queue: https://github.com/lovell/sharp/blob/aa9b328778ef00971e883365ebedd480799394a2/src/common.cc#L420 and likely an issue with libc.so on the Linux image Travis uses.

vipsWarnings is a statically allocated variable in the sharp namespace. If my memory of C++ is correct, static variables like this are destroyed in LIFO order when the main() function exits, which seems to be when the issue is occurring.
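
As a minimal C++ sketch of that destruction-order behaviour (hypothetical names, not sharp's actual code):

#include <cstdio>
#include <queue>
#include <string>

// Hypothetical stand-in for sharp's vipsWarnings: a namespace-scope static
// whose destructor runs during program teardown, after main() returns.
static std::queue<std::string> warnings;

struct LaterStatic {
  ~LaterStatic() {
    // Statics are destroyed in reverse (LIFO) order of construction, so this
    // destructor runs before ~warnings. If teardown code instead touched a
    // static that had already been destroyed, the result would be undefined
    // behaviour -- the kind of exit-time crash being hypothesized here.
    std::printf("LaterStatic destroyed\n");
  }
};

static LaterStatic later;  // constructed after `warnings`, destroyed before it

int main() {
  warnings.push("example warning");
  return 0;  // `later`, then `warnings`, are destroyed after main() returns
}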

I could be way off base; with a local reproduction it would likely not be too hard to figure out. If such a repro exists, I would recommend:

  1. Recompile and confirm it still crashes.
  2. Remove the warning-related code that interacts with the queue, and see if that fixes it.
  3. If that leads to something, narrow in.

@stefanpenner
Contributor

What version of glibc is on those Linux boxes?

@scalvert

(Ubuntu EGLIBC 2.19-0ubuntu6.15) 2.19

@rwjblue
Member

rwjblue commented Nov 2, 2020

Going to close for now, happy to reopen if folks think this is still an issue.
