remote_file cache attribute proposal #94

Closed

mattray
Contributor

@mattray mattray commented Feb 26, 2015

The remote_file resource will add a new attribute, cache, requesting that the Chef Server mirror the file.

@kcbraunschweig

I'm not sure I understand the goal here. So this is the client side of a
bookshelf feature? It says hosted chef won't use this, which to me is the
only place the phrase 'local mirror' makes sense since S3 isn't really
local to the chef server. It's sorta weird to have a resource param in the
client that just magically doesn't work if you're on hosted chef.

Assuming this is bookshelf only, then on a standalone chef server we're
just moving load from bookshelf to nginx. We've had bookshelf fall over in
the past so I'm not disagreeing that nginx can probably be made to scale
further, but is that the problem we're trying to solve? How will this make
downloads faster (vs. just supporting more nodes at the same rate)? The
race conditions endemic to this sort of caching seem a high price to pay,
not to mention potentially duplicating memory and/or disk between bookshelf
and nginx. This is a bit more compelling for tier-model chef servers if
it'd allow horizontal scaling of the nginx component on the front-ends vs.
bookshelf which is on the backend only. If that's an intended feature it
should be called out.

Is there really not a cache invalidation strategy that could mitigate the
problem? Sure, it's a feature that defaults to off, and if you turn it on you
should know not to shoot yourself in the foot. However, it'll be really easy
to do a theoretically atomic cookbook upload which updates a file and adds
a change that depends on that update and then it breaks because of the
cache.

KC


@jonlives
Contributor

@kcbraunschweig I think you might be confusing remote_file and cookbook_file here. The objective is that when you download a remote_file from, say, HTTP or FTP, you can have it cached on the Chef server, avoiding the need to go out to the internet or another source every time.

@thommay
Collaborator

thommay commented Feb 26, 2015

My concern is that the feature will never get turned on, because doing this at the level of an entire chef server is not granular enough.

I absolutely agree that you'd probably never want to do this for a CI environment, but for a production environment you probably would; I think you'd want to do this at a per-environment level (which opens up a whole new can of worms).

@jonlives
Contributor

I think it might also be worth clarifying that the proxy mechanism on the server would ideally check for (and respect) cache headers, If-Modified-Since, etc. on the remote end. At the moment the RFC just specifies a configurable duration for cached content.
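To make the difference concrete, here is a minimal sketch of the two policies: a fixed-duration check versus revalidation against the remote end with conditional request headers. It is not from the RFC; `CacheEntry`, `is_fresh`, `revalidation_headers`, and `can_reuse` are hypothetical names used only for illustration. The only parts taken from HTTP itself are the `If-Modified-Since` and `If-None-Match` headers and the 304 status code.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    """Metadata a caching proxy might keep next to a mirrored file (hypothetical)."""
    fetched_at: float                      # epoch seconds when the copy was stored
    last_modified: Optional[str] = None    # Last-Modified value sent by the origin
    etag: Optional[str] = None             # ETag value sent by the origin


def is_fresh(entry: CacheEntry, now: float, ttl_seconds: float) -> bool:
    """Fixed-duration policy: serve the copy until it is older than the TTL."""
    return (now - entry.fetched_at) < ttl_seconds


def revalidation_headers(entry: CacheEntry) -> Dict[str, str]:
    """Build conditional request headers from the validators saved with the copy."""
    headers: Dict[str, str] = {}
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    return headers


def can_reuse(status_code: int) -> bool:
    """A 304 Not Modified reply means the origin confirmed the copy is current."""
    return status_code == 304
```

With only the duration check, a file that changes upstream keeps being served until the TTL runs out. With revalidation, the proxy would ask the origin on each request (or once the TTL expires) and only re-download when the reply is not a 304.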

@kcbraunschweig

Jon - my bad, I was confused. That is a terrifying use case for a
production environment though...


@mattray
Contributor Author

mattray commented Feb 27, 2015

My goal with this proposed RFC was to see if others have similar issues with remote_file content and to see if the Chef server is an option for mirroring. Since the bandwidth in my lab is spotty and I've written cookbooks for apt-cacher, squid and bittorrent in the past, I naively thought other folks might want to leverage the Chef server this way. I hadn't considered leveraging Bookshelf or any of Chef's access controls, I just wanted a transparent proxying cache.

Clearly @kcbraunschweig's use case is very different from mine, and I didn't spend much time on the details. I can support myself if people don't think it's a useful suggestion for a wider userbase that warrants greater investigation.

@coderanger
Contributor

Security concerns seem to outweigh any benefit. With this enabled, any node could trivially DoS the Chef Server by requesting too many/too big caches. Having a high-quality (drop in?) resource for something like Herd or Murder seems like an even better solution to the problem without most of the downsides.
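As a rough illustration of the "too many/too big" concern, the sketch below shows the kind of guard a server-side cache would need before accepting a request: a per-file size cap plus a total byte budget with least-recently-admitted eviction. Nothing here is part of the RFC or of Chef Server; `BoundedCache`, `admit`, and both limits are hypothetical, and it tracks sizes only, not file contents.

```python
from collections import OrderedDict


class BoundedCache:
    """Tracks cached file sizes under a per-file cap and a total byte budget."""

    def __init__(self, max_total_bytes: int, max_file_bytes: int) -> None:
        self.max_total_bytes = max_total_bytes
        self.max_file_bytes = max_file_bytes
        self.total_bytes = 0
        self._sizes = OrderedDict()  # url -> size in bytes, oldest entry first

    def admit(self, url: str, size: int) -> bool:
        """Return True if the file was accepted, evicting old entries if needed."""
        # Refuse anything over the per-file cap or larger than the whole budget.
        if size > self.max_file_bytes or size > self.max_total_bytes:
            return False
        # Re-admitting a URL replaces its previous entry.
        if url in self._sizes:
            self.total_bytes -= self._sizes.pop(url)
        # Evict the oldest entries until the new file fits in the budget.
        while self.total_bytes + size > self.max_total_bytes:
            _, evicted_size = self._sizes.popitem(last=False)
            self.total_bytes -= evicted_size
        self._sizes[url] = size
        self.total_bytes += size
        return True

    def __contains__(self, url: str) -> bool:
        return url in self._sizes
```

Even with limits like these, a node could still churn the cache by requesting many distinct URLs, so this bounds disk use but does not address the bandwidth side of the concern.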

@mattray
Contributor Author

mattray commented Mar 4, 2015

Withdrawn.

@mattray mattray closed this Mar 4, 2015
6 participants