This repository has been archived by the owner on Dec 13, 2018. It is now read-only.

Use cached layers if repository has a yarn lockfile #91

Open

rclark opened this issue Feb 9, 2017 · 16 comments

Comments

@rclark (Contributor) commented Feb 9, 2017

Right now, images are built with the --no-cache flag. Using cached layers is a way to significantly decrease build times, and long build times are one of the biggest bummers about our current CI flow.

One of the arguments in favor of --no-cache is that without it, npm install with semver ranges in package.json could lead to images that reuse old cached layers for node.js dependencies. This could lead to unexpected (and very non-deterministic) mismatches between your local environment and your production environment.

Yarn's use of a lockfile that pins node.js dependency versions and is committed in the repo avoids this misstep, and makes me wonder if we could drop the --no-cache flag when there's a yarn lockfile in the repo.
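
For illustration, here's a minimal sketch of the pattern (a hypothetical node.js Dockerfile, not any specific ecs-conex-built image): with a committed lockfile, the cache key for the dependency layer is the lockfile's exact contents, so a cache hit can only ever reproduce the pinned tree.

```bash
# Hypothetical Dockerfile written out from a shell step; names and base image
# are assumptions for illustration, not ecs-conex internals.
cat > Dockerfile <<'EOF'
FROM node:6

WORKDIR /app

# Copy only the dependency manifests first, so this layer is reused until
# package.json or yarn.lock actually changes.
COPY package.json yarn.lock /app/

# --frozen-lockfile fails if yarn.lock is out of sync with package.json,
# so a cache hit on this layer is always the pinned dependency tree.
RUN yarn install --frozen-lockfile

# Application code changes only invalidate the layers from here down.
COPY . /app
EOF

# $GITSHA is a placeholder tag for this sketch.
docker build -t myapp:"$GITSHA" .
```

With only package.json and semver ranges, a cache hit on the install layer can hand you whatever versions npm resolved on some earlier build, which is exactly the mismatch described above.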

However there are still a few other questions to weigh against such a decision:

  • You would only get build-time caching benefits sometimes and not all the time. This depends on whether your conex worker task lands on an EC2 that still has the cached layers from a previous build.

  • Because of the above, you may want to try to keep cached layers lying around on the EC2s for longer, which leads to disk space management problems.

It may be worth exploring this anyways, without adjusting anything about how we have our EC2s clean up old images/layers. If we can demonstrate a significant benefit for projects with hefty node.js dependency trees or huge unix package dependencies, it may be worthwhile.

cc @springmeyer @scothis @mcwhittemore @mapsam @GretaCB

@mapsam (Member) commented Feb 9, 2017

You would only get build-time caching benefits sometimes and not all the time

Personally, this doesn't feel like a reason not to do it, but I can see why it would be confusing if ecs-conex build times start varying wildly.

disk space management problems

This would impact the entire EC2 right? So any task running on this EC2 could be affected, not just the service which has started caching images? Which services currently use up the most disk space?

@rclark (Contributor, Author) commented Feb 9, 2017

Yes, disk space is an EC2-wide problem. Controls on disk space are really outside the scope of ecs-conex's responsibilities, except that at present conex explicitly cleans up after itself, removing everything about the image it just built, in an effort to reduce its impact on any other services that might be sharing the host EC2s.
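
For context, the kind of post-build cleanup being described looks roughly like this (a sketch with assumed variable names, not the actual ecs-conex script):

```bash
# Remove the image that was just built, then drop any dangling layers it left
# behind, so other services sharing the host aren't squeezed for disk.
# REPO_URI and GITSHA are assumed names for this sketch.
docker rmi "${REPO_URI}:${GITSHA}" || true
docker rmi $(docker images -q --filter dangling=true) 2>/dev/null || true
```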

@mcwhittemore

If we can demonstrate a significant benefit for projects with hefty node.js dependency trees or huge unix package dependencies, it may be worthwhile.

The move to yarn itself will help a lot with node.js dependencies, so if that's a requirement and there is a downside to doing this, we might want to see what the gains from using yarn alone are first.

@springmeyer

Thanks for writing this up and providing a view into the state of things.

Non-deterministic builds feel like a major drawback.

If we can demonstrate a significant benefit for projects with hefty node.js dependency trees or huge unix package dependencies, it may be worthwhile.

My feeling is that we should continue to focus on trimming the dependency trees and unix package dependencies as a way to speed things up. There are still a lot of duplicate npm packages happening in projects, and we can move to using more mason packages in place of apt deps to drop weight.

@mapsam (Member) commented Feb 10, 2017

That's a good point @mcwhittemore - yarn will definitely bring in some big savings, especially around duplicate npm packages.

@mcwhittemore

There are still a lot of duplicate npm packages happening in projects

Agreed. This is one benefit of yarn. Another is that, unlike npm, it is deterministic thanks to its lockfile.

@lukasmartinelli

One of the arguments in favor of --no-cache is that without it, npm install with semver ranges in package.json could lead to images that reuse old cached layers for node.js dependencies. This could lead to unexpected (and very non-deterministic) mismatches between your local environment and your production environment.

What about being able to specify whether caching should be enabled or not in a package.json or .ecs-conex file?

A simple way to get caching would be to download the image of the previous commit and then run the build without --no-cache.
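
Roughly, and hedging on the details (the tag-by-commit scheme and variable names are assumptions; on Docker 1.13+ the pulled image has to be passed in via --cache-from to act as a cache source):

```bash
# Warm the layer cache from the previous commit's image before building.
# REGISTRY, REPO, and GITSHA are placeholders for this sketch.
PREVIOUS_SHA=$(git rev-parse HEAD^)

# Best effort: a missing previous image just means a cold cache, not a failure.
docker pull "${REGISTRY}/${REPO}:${PREVIOUS_SHA}" || true

docker build \
  --cache-from "${REGISTRY}/${REPO}:${PREVIOUS_SHA}" \
  -t "${REGISTRY}/${REPO}:${GITSHA}" .
```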

@rclark (Contributor, Author) commented Mar 3, 2017

download the image of the previous commit and then run the build without --no-cache

I believe we originally did this, and then removed it once we implemented --no-cache. Worth exploring for sure, though you'd want some way to make sure that downloading time isn't going to be a significant penalty.

@lukasmartinelli

IMHO it is the responsibility of the image creator to make sure the build always yields consistent results - not of the build infrastructure.

The long wait times for an image build slow down iteration a lot.

I believe we originally did this, and then removed it once we implemented --no-cache. Worth exploring for sure, though you'd want some way to make sure that downloading time isn't going to be a significant penalty.

In my experience, fetching from npm or apt is almost always slower than downloading the compressed filesystem layer over the internal network (even though AWS provides its own mirrors).
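
That's easy enough to sanity-check per project before committing to the approach, e.g. (hypothetical names; numbers will vary by host and network):

```bash
# Compare "pull the previous image" against a genuinely cold dependency
# install. REGISTRY/REPO/PREVIOUS_SHA are placeholders for this sketch.
time docker pull "${REGISTRY}/${REPO}:${PREVIOUS_SHA}"

rm -rf node_modules && yarn cache clean   # force a cold install
time yarn install --frozen-lockfile
```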

@rclark (Contributor, Author) commented Mar 16, 2017

I think that the next action here is a PR to download previous images before a build, and then gather some metrics on build durations to confirm that we're seeing a benefit. Once #94 lands, the duration of each build will be captured in CloudWatch automatically (new watchbot feature).

In terms of consistency between builds, I'm comfortable using the existence of a yarn lockfile as a cue that we should download a prior image and build off of it. I agree with the sentiment that images should handle this themselves, @lukasmartinelli, but the reality of our npm usage is that we aren't there for most of our builds.
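
To sketch what that cue could look like in the build step (assumed variable names and tag scheme, not a settled design):

```bash
# If the repo commits a yarn.lock, reuse the prior image as a cache source;
# otherwise keep building with --no-cache. BEFORE_SHA/AFTER_SHA, REGISTRY, and
# REPO are assumptions for this sketch, not ecs-conex internals.
if [ -f yarn.lock ]; then
  docker pull "${REGISTRY}/${REPO}:${BEFORE_SHA}" || true
  docker build --cache-from "${REGISTRY}/${REPO}:${BEFORE_SHA}" \
    -t "${REGISTRY}/${REPO}:${AFTER_SHA}" .
else
  docker build --no-cache -t "${REGISTRY}/${REPO}:${AFTER_SHA}" .
fi
```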

@lukasmartinelli

🤗 I still really, really want faster image builds! Give us caching, platform overlords 🤗

@lukasmartinelli

(image)

@rclark (Contributor, Author) commented May 3, 2017

@lukasmartinelli let me just clone myself real quick. brb.

@lukasmartinelli

@lukasmartinelli let me just clone myself real quick. brb.

Imagine if we had three clarks!! 🤗 ❤️

@lukasmartinelli commented Jun 22, 2017

🤗 🤗 🤗

Bump. I think this will lead to a lot more developer productivity 🔨 + saved ⏲ => saved 💵 + more dev 😄.

Especially when iterating on CloudFormation templates: when one edits only the CloudFormation files, the Docker image shouldn't need to be rebuilt.
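
For example, something along these lines could skip the build entirely for template-only pushes (assuming templates live under cloudformation/, which may not hold for every repo):

```bash
# Skip the image build when a push only touched CloudFormation templates.
# The cloudformation/ path and the BEFORE_SHA/AFTER_SHA/REGISTRY/REPO
# variables are assumptions for this sketch.
if git diff --name-only "${BEFORE_SHA}" "${AFTER_SHA}" | grep -qv '^cloudformation/'; then
  docker build -t "${REGISTRY}/${REPO}:${AFTER_SHA}" .
else
  echo "only template changes; skipping image build"
fi
```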

@emilymcafee (Contributor)

Per chat, this is not something that is likely to move in the next couple months.
