Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel builds with similar apps and deps can cause crashes #124

Open
workmanw opened this Issue Aug 8, 2017 · 12 comments

Comments

Projects
None yet
5 participants
@workmanw
Copy link

workmanw commented Aug 8, 2017

With @rwjblue's help, I discovered that building two of my apps in parallel can cause a crash related to broccoli-persistent-filter. Here is an example stack:

Build failed.
The Broccoli Plugin: [BroccoliMergeTrees: TreeMerger (appAndDependencies)] failed with:
Error: unexpected end of file
    at Zlib._handle.onerror (zlib.js:370:17)

The broccoli plugin was instantiated at:
    at BroccoliMergeTrees.Plugin (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/broccoli-plugin/index.js:7:31)
    at new BroccoliMergeTrees (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/ember-cli/node_modules/broccoli-merge-trees/index.js:16:10)
    at Function.BroccoliMergeTrees [as _upstreamMergeTrees] (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/ember-cli/node_modules/broccoli-merge-trees/index.js:10:53)
    at mergeTrees (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/ember-cli/lib/broccoli/merge-trees.js:85:33)
    at EmberApp._mergeTrees (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/ember-cli/lib/broccoli/ember-app.js:1815:12)
    at EmberApp.appAndDependencies (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/ember-cli/lib/broccoli/ember-app.js:1135:17)
    at EmberApp.javascript (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/ember-cli/lib/broccoli/ember-app.js:1235:30)
    at EmberApp.toArray (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/ember-cli/lib/broccoli/ember-app.js:1672:12)
    at EmberApp.toTree (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/node_modules/ember-cli/lib/broccoli/ember-app.js:1694:38)
    at module.exports (/mnt/jenkins-workspaces/batterii-horizons-deploy/src/buttercup-epcot/apps/bc-admin/ember-cli-build.js:97:14)

We determined the cause was related to stefanpenner/async-disk-caches use of the system temp directory. The two applications are separate apps (in separate directories), but they are similar in size and almost identical in dependencies. The builds kick off at the same time in parallel. Both builds end up using the same system tmp cache and one clobbers the other.

A workaround is to set an env variable that causes it to not use the system cache:

BROCCOLI_PERSISTENT_FILTER_CACHE_ROOT=./cache ember build -prod
@rwjblue

This comment has been minimized.

Copy link
Collaborator

rwjblue commented Aug 8, 2017

Thank you for reporting! I believe we were aware (generally speaking) that a race could occur here. We should try add in some more resiliency to ensure that cross-process caching does not fail the build. I believe that in the worst case scenario, we should just opt out of caching and recalculate.

@stefanpenner - Thoughts?

@workmanw

This comment has been minimized.

Copy link
Author

workmanw commented Aug 8, 2017

One other relevant note is that I realized that in my case it's also possible for the exact same app [w/ same deps] to be building multiple times in parallel in different folders. Basically if the same commit ends up on multiple branches that are being tracked by CI, it could end up building them both at the same time.

@stefanpenner

This comment has been minimized.

Copy link
Owner

stefanpenner commented Aug 8, 2017

building the same project (exact some on-disk files/deps etc) more then once concurrently isn't really something we support. At-least not with a persistent cache... It is possible to support this, but It would come at either a complexity and/or performance cost to the regular case.

It might be a good idea to have your CI box limit its concurrency somewhat, or concurrent tasks (like building) will starve each other for resources...


I don't think this is an issue with this library, but we should make clear reference to it in the readme.

If we want to mitigate this issue, we may want to an a locking mechanism to ember-cli itself, essentially disallowing the same project to be actively build concurrently.


@workmanw your above "Work-around" actually seems like the exact "configuration" I would expect for supporting concurrent builds. Alternatively, we could also enable an "opt-out" of persistent cache altogether... as you are then essentially opting out of persistent-cache entirely on your CI box.

@workmanw

This comment has been minimized.

Copy link
Author

workmanw commented Aug 8, 2017

@stefanpenner I'm 👍 for making it an option. I know most people run their builds in a containerized service (Travis, etc) and that makes us in the minority here.


building the same project (exact some on-disk files/deps etc)

To be clear, it's the same project/deps but on a different branch in a different cloned instance of the repo (and thus in a different directory). A single GitHub post hook can trigger both builds, so they end up running simultaneously and thus fight over the system temp cache.


I do think we would get some value out of having a configuration for the cache path because in our case, we build both dev and prod versions of the app back to back. So without really knowing too many of the details, it sounds like there would be some cache benefit for that.

I'm fine to continue using BROCCOLI_PERSISTENT_FILTER_CACHE_ROOT. But it would be nice to have it documented and marked public so I don't have to worry about regressions. And I wouldn't mind a slightly shorter ENV variable 😃 .

@stefanpenner

This comment has been minimized.

Copy link
Owner

stefanpenner commented Aug 9, 2017

To be clear, it's the same project/deps but on a different branch in a different cloned instance of the repo (and thus in a different directory). A single GitHub post hook can trigger both builds, so they end up running simultaneously and thus fight over the system temp cache.

Something seems fishy then, because the cached files should have different mtime/atimes and then ultimately not collide...

I'm assuming you come to this issue via BroccoliPersistentFilter (or one of its many subclasses?) If so, then maybe we are missing including something important in the cache key..

@workmanw

This comment has been minimized.

Copy link
Author

workmanw commented Aug 9, 2017

@stefanpenner One point of interest is that it only ever seemed to happen on prod builds (w/ fingerprinting). I don't know if that's relevant or not.

I'm happy to help in any way I can. We could do a screen hero if you want. But honestly I don't want to suck up your time if I'm the only one experiencing this. And the debugging is not super pleasant because it's all over SSH. That said, I'm always happy to help.

roschaefer added a commit to roschaefer/rundfunk-mitbestimmen that referenced this issue May 5, 2018

Trying to fix flaky zlib Error
Disabling the tmp cache for `ember test` could be a solution, as we're
running two builds of the frontend at the same time in our travis ci
setup. 1. `foreman -f ProcfileTesting &` (send to the background) and 2.
`ember test`.
See
stefanpenner/broccoli-persistent-filter#124

roschaefer added a commit to roschaefer/rundfunk-mitbestimmen that referenced this issue May 5, 2018

Trying to fix flaky zlib Error
Disabling the tmp cache for `ember test` could be a solution, as we're
running two builds of the frontend at the same time in our travis ci
setup. 1. `foreman -f ProcfileTesting &` (send to the background) and 2.
`ember test`.
See
stefanpenner/broccoli-persistent-filter#124

roschaefer added a commit to roschaefer/rundfunk-mitbestimmen that referenced this issue May 5, 2018

Trying to fix flaky zlib Error
Disabling the tmp cache for `ember test` could be a solution, as we're
running two builds of the frontend at the same time in our travis ci
setup. 1. `foreman -f ProcfileTesting &` (send to the background) and 2.
`ember test`.
See
stefanpenner/broccoli-persistent-filter#124

mattwr18 added a commit to roschaefer/rundfunk-mitbestimmen that referenced this issue May 6, 2018

Trying to fix flaky zlib Error
Disabling the tmp cache for `ember test` could be a solution, as we're
running two builds of the frontend at the same time in our travis ci
setup. 1. `foreman -f ProcfileTesting &` (send to the background) and 2.
`ember test`.
See
stefanpenner/broccoli-persistent-filter#124
@xbmono

This comment has been minimized.

Copy link

xbmono commented Oct 10, 2018

when I run unit tests in my Ember app in parallel, even if they are in separate workspace so no shared folder, I get this error, I wonder if this is related:

[emberTests2] cleaning up...
[emberTests2] Build failed.
[emberTests2] Build Canceled: Broccoli Builder ran into an error with `Babel` plugin. 💥
[emberTests2] unexpected end of file
[emberTests2] Error: unexpected end of file
[emberTests2]     at Zlib._handle.onerror (zlib.js:370:17)
@stefanpenner

This comment has been minimized.

Copy link
Owner

stefanpenner commented Oct 10, 2018

@xbmono yes, we can fix this by changing the cache key to consider the full path of the workspace.

@xbmono

This comment has been minimized.

Copy link

xbmono commented Oct 10, 2018

@stefanpenner thanks it worked :)

@synaptiko

This comment has been minimized.

Copy link

synaptiko commented Jan 24, 2019

We use monorepo for our ember addons and apps and we use lerna to manage it. Until recently we were running commands in series but we wanted to make running tests faster so we experimented with lerna's parallelism but after few attempts we started having the error described here.

@xbmono yes, we can fix this by changing the cache key to consider the full path of the workspace.

I think such a fix would be good enough while keeping the advantage of persistent cache.

Would you reconsider your approach? I think there are few use-cases which would benefit from it even thought it was not originally meant to be used for concurrent runs.

Or if those are use-case you want to avoid, could you please consider adding some option which could be provided over ember-cli-build.js file instead of environment variable?

@rwjblue

This comment has been minimized.

Copy link
Collaborator

rwjblue commented Jan 24, 2019

@synaptiko

Or if those are use-case you want to avoid, could you please consider adding some option which could be provided over ember-cli-build.js file instead of environment variable?

Environment variables can easily be set in ember-cli-build.js:

process.env.SOME_AWESOME_THING = "salsa";

@xbmono yes, we can fix this by changing the cache key to consider the full path of the workspace.

I think such a fix would be good enough while keeping the advantage of persistent cache.

Would you reconsider your approach?

I'm not sure how to interpret this. Are you saying that @stefanpenner's suggestion would work or would not work? As far as I can tell his suggestion would solve the specific problem reported (amongst others), and we would gladly review/accept pull requests to implement. What do we need to reconsider?

@synaptiko

This comment has been minimized.

Copy link

synaptiko commented Jan 24, 2019

Environment variables can easily be set in ember-cli-build.js

Thanks for the tip, I wasn't aware of this method.

I'm not sure how to interpret this. Are you saying that @stefanpenner's suggestion would work or would not work?

I think that if the full path is part of the cache key then there won't be any race condition which means it would fix both CI use-case and monorepo commands running in parallel.

I wanted you to reconsider the method of the caching mechanism.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.