Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ensure builder.cleanup() waits on pending work #393

Merged
merged 7 commits into from Mar 14, 2019
Merged

Conversation

stefanpenner
Copy link
Contributor

@stefanpenner stefanpenner commented Mar 4, 2019

Prior to this, if cleanup was invoked while the build was cancelling, it would begin to do cleanup prior to the current step completing. Which would typically result in "missing directory" or "missing file" errors, which would then result in the build crashing and not completing it cleanup rather then exiting cleanly.

  • debug OOM which becomes apparent during mid-build cancellations (issue with: aggregateTime)
  • confirm
  • test

@stefanpenner
Copy link
Contributor Author

@oligriffiths this is likely the solution to the problem we spoke about on Wednesday. I finally had some time to debug it...

@oligriffiths
Copy link
Contributor

Looks sane. Fix tests and we’re good I think.

@stefanpenner
Copy link
Contributor Author

@oligriffiths aggregateTime seems to OOM when we don't crash during mid-build cancellation.
I'm investigating why, and see if I can fix...

@stefanpenner
Copy link
Contributor Author

Although this is green, I haven't been able to fix the OOM issue (or add a test here that demonstrates it) So we should hold-off until i, or someone else, has a chance to look into that.

@oligriffiths
Copy link
Contributor

@stefanpenner got a repo to demo the OOM?

@oligriffiths
Copy link
Contributor

oligriffiths commented Mar 8, 2019

Ok so I reproduced the issue: (ember build, wait for 'building...' then cmd+c)

ember build
Environment: development
cleaning up...
cleaning......
Cannot stop: not the current node.

<--- Last few GCs --->

[68590:0x102801000]    13909 ms: Mark-sweep 998.5 (1045.1) -> 828.5 (875.1) MB, 450.2 / 0.0 ms  allocation failure GC in old space requested
[68590:0x102801000]    14610 ms: Mark-sweep 828.5 (875.1) -> 828.4 (838.6) MB, 701.4 / 0.0 ms  last resort GC in old space requested
[68590:0x102801000]    15083 ms: Mark-sweep 828.4 (838.6) -> 828.4 (838.1) MB, 472.2 / 0.0 ms  last resort GC in old space requested


<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x38b802e0427d]
Security context: 0x15b0e6c206a9 <JSObject>
    1: aggregateTime(aka aggregateTime) [/Users/goli/Projects/broccoli/lib/builder.js:~448] [pc=0x38b803389be9](this=0x15b0095022e1 <undefined>)
    2: buildHeimdallTree [/Users/goli/Projects/broccoli/lib/builder.js:400] [bytecode=0x15b0a04faca9 offset=21](this=0x15b0b9f13439 <EventEmitter map = 0x15b02a42ed51>,outputNodeWrapper=0x15b05fa22971 <NodeWrapper m...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

Note the Cannot stop: not the current node. line, it's important.

During cleaning, heimdall is being messed with as it is being started/stopped in the wrong order. What happens is a node starts being build, and heimdall.start() is called from beginNode fire in Broccoli. Then cleanup() is called and https://github.com/ember-cli/ember-cli/blob/master/lib/models/builder.js#L250 starts a new heimdall node, and then the endNode is fired for the currently building node, then the cleanup heimdall node is stopped.

So for now, I've added a _isCleaning flag to skip doing the heimdall doing the aggregate time.

➜  ember build
Environment: development
cleaning up...
cleaning......
Cannot stop: not the current node.
Built project successfully. Stored in "dist/".

So the same Cannot stop: not the current node. is output (issue still exists) but the OOM is gone.

lib/builder.js Outdated Show resolved Hide resolved
@oligriffiths
Copy link
Contributor

oligriffiths commented Mar 9, 2019

So, I added some debug statements to beginNode and endNode. Even when cancelling the ember-cli build, the broccoli build doesn't get cancelled. Example: ember-cli-build.js

'use strict';

const Plugin = require('broccoli-plugin');

class Sleep extends Plugin {
  build() {
    return new Promise(resolve => {
      console.log('Going to sleep');
      setTimeout(() => {
        console.log('Waking up');
        resolve();
      }, 2000);
    });
  }
}

module.exports = function(defaults) {
  return new Sleep([
    new Sleep([], { name: 'Sleep 1' })
  ], { name: 'Sleep 2' });
};
➜  ember-test git:(master) ✗ ember build
Environment: development
START Sleep 1
Going to sleep
cleaning up...
Waking up
DONE Sleep 1
Cannot stop: not the current node.
START Sleep 2
Going to sleep
Waking up
DONE Sleep 2
Built project successfully. Stored in "dist/".

CMD+C was initiated after the first Going to sleep, see cleaning up....
It completes after, but then the next node continues to build Sleep 2, this doesn't seem right...

@stefanpenner
Copy link
Contributor Author

@oligriffiths ya, looks like something else is fishy...

@oligriffiths
Copy link
Contributor

oligriffiths commented Mar 11, 2019

@stefanpenner yeah so broccoli-builder did call cancel() https://github.com/ember-cli/broccoli-builder/blob/0-18-x/lib/builder.js#L212

However Broccoli 2 does not, and if you do call cancel, you get an exception due to https://github.com/broccolijs/broccoli/blob/master/lib/builder.js#L109 and https://github.com/broccolijs/broccoli/blob/master/lib/builder.js#L119

I don't know what the intention behind those 2 lines is, can you shed some light? Seems like we need to do some refactoring.

Imo, the flow should be

  • build
  • start node
  • cleanup called
    • promise(waitOnNode).then(cancel).then(cleanup) => return promise & wait for node to finish
  • end node
  • cancel
  • cleanup
  • return

@stefanpenner
Copy link
Contributor Author

@oligriffiths I've fixed it

lib/builder.js Outdated Show resolved Hide resolved
Copy link
Contributor

@oligriffiths oligriffiths left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this is enough.
Should we not be addressing the startNode/endNode issue. See comment.

lib/builder.js Outdated Show resolved Hide resolved
@stefanpenner stefanpenner force-pushed the wait-to-cleanup branch 2 times, most recently from c21f52a to 1728a72 Compare March 12, 2019 21:08
@stefanpenner
Copy link
Contributor Author

@oligriffiths updated, mind re-reviewing?

@oligriffiths
Copy link
Contributor

oligriffiths commented Mar 12, 2019

@stefanpenner something is still off. Trying with a new ember install, hitting CMD+C after building... is printed to console:

Environment: development
cleaning up...
Cannot stop: not the current node.
BUILD CANCELLED


Stack Trace and Error Report: /var/folders/v0/v1xr03j523d3py2sp5m4_gsm0000gn/T/error.dump.210e83fb537b3243555bd0c7f3062c40.log

// log
ERROR Summary:

  - broccoliBuilderErrorStack: [undefined]
  - code: [undefined]
  - codeFrame: [undefined]
  - errorMessage: BUILD CANCELLED
  - errorType: [undefined]
  - location:
    - column: [undefined]
    - file: [undefined]
    - line: [undefined]
  - message: BUILD CANCELLED
  - name: Cancelation
  - nodeAnnotation: [undefined]
  - nodeName: [undefined]
  - originalErrorMessage: [undefined]
  - stack: Cancelation: BUILD CANCELLED
    at CancelationRequest.throwIfRequested (/Users/goli/Projects/broccoli/lib/cancelation-request.js:17:13)
    at promiseFinally.then (/Users/goli/Projects/broccoli/lib/builder.js:115:48)
    at process._tickCallback (internal/process/next_tick.js:68:7)

Cannot stop: not the current node. means we have unbalanced heimdall nodes.

@stefanpenner
Copy link
Contributor Author

@oligriffiths ya we explicitly fail the build, as the build was not successful so the "BUILD CANCELLED" is expected. That being said, i cannot replicate the Cannot stop: not the current node message myself.

@oligriffiths
Copy link
Contributor

Adding a log to startNode and endNode you get this:

➜  ember-test git:(master) ember build
Environment: development
Start: EMPTY_MERGE_TREE
End: BroccoliMergeTrees
Start: ProcessedTemplateTree
End: Funnel
Start: /Users/goli/Projects/ember-test/app/styles
End: UnwatchedDir
Start: Funnel (styles)
End: Funnel
Start: /Users/goli/Projects/ember-test/tests
End: WatchedDir
Start: broccoli-persistent-filter:EslintValidationFilter
cleaning up...
End: broccoli-persistent-filter:EslintValidationFilter
Cannot stop: not the current node.
BUILD CANCELLED

@stefanpenner
Copy link
Contributor Author

@oligriffiths we should likely make BUILD CANCELLED less screaming case.

@stefanpenner
Copy link
Contributor Author

@oligriffiths are you sure the heimdall warning is from broccoli, and not something in ember-cli?

@stefanpenner stefanpenner merged commit 898ff1a into master Mar 14, 2019
@stefanpenner
Copy link
Contributor Author

stefanpenner commented Mar 14, 2019

kill logging: heimdalljs/heimdalljs@baac3fe

* 43d1ac2 - (HEAD -> v0.2.x, tag: v0.2.6) release v0.2.6 🎉 (7 months ago) <Stefan Penner>
* 28cbde3 - (origin/v2-kill-bad-logging) remove non-actionable logging (7 months ago) <Stefan Penner>

@stefanpenner stefanpenner deleted the wait-to-cleanup branch March 14, 2019 14:40
@stefanpenner
Copy link
Contributor Author

released as v2.2.0 🎉

@oligriffiths
Copy link
Contributor

👏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants