Create missing security groups when authorizing #189

Closed
fractaloop opened this Issue Oct 9, 2012 · 8 comments

Contributor

fractaloop commented Oct 9, 2012

The control-nfs is typically the first machine to start when launching a cluster.

facet :nfs do
  role                :nfs_server
  cloud(:ec2).security_group(:nfs_server).authorize_group :nfs_client
end

During the launch process, Ironfan attempts to authorize nfs-client to the nfs_server group. Since no security group nfs_client exists, the authorization fails and the launching process halts.

This can be avoided by manually creating the nfs_client security group on the AWS console.

Should #authorize_group create the group if it does not exist? While this could lead to spurious groups if there is a typo, this seems like pretty boilerplate stuff.
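To make the proposal concrete, here is a minimal sketch of a create-if-missing authorization. It runs against an in-memory stand-in rather than AWS; `FakeEc2` and its methods are hypothetical names for illustration, not Ironfan's or fog's actual interface.

```ruby
# Hypothetical in-memory stand-in for the EC2 security group API.
# Nothing here talks to AWS or mirrors Ironfan's real classes.
class FakeEc2
  attr_reader :groups

  def initialize
    @groups = {} # group name => names of groups authorized to reach it
  end

  def group_exists?(name)
    @groups.key?(name.to_s)
  end

  def create_group(name)
    @groups[name.to_s] ||= []
  end

  # Authorize `source` to reach `target`, creating either group if it is missing.
  def authorize_group(target, source)
    [target, source].each { |g| create_group(g) unless group_exists?(g) }
    rules = @groups[target.to_s]
    rules << source.to_s unless rules.include?(source.to_s)
    rules
  end
end

ec2 = FakeEc2.new
ec2.authorize_group(:nfs_server, :nfs_client)
puts ec2.groups.inspect
```

The trade-off is the one already noted: a typo in the group name would quietly create a stray group instead of failing.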

Is there a preferred way to proceed on this issue?

Edit: Converting to proposal for automatic creation of missing groups.

Contributor

temujin9 commented Oct 9, 2012

Creating the group is better: it's better to make a mess when told to do the wrong thing than to fail for lack of an easily made resource.

We can implement something that helps clean an environment of stale groups, etc. as a separate command under knife cluster, when we start to need such a tool.

Owner

mrflip commented Oct 10, 2012

I think authorize group does try to create a group; I think it's creating
the wrong one.

I am guessing that (due to historical reasons) we have all of:

  • parts that refer to the nfs-client with a dash- group
  • parts that refer to the nfs_client with an underbar_ group
  • if I recall, one method somewhere in the code that subs dashes with
    underbars (and uses sub, not gsub), which I don't think we used
    to do.
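
For reference, the sub vs. gsub difference matters for any name with more than one dash. A quick illustration in plain Ruby (the group name is made up for the example):

```ruby
# Hypothetical group name containing several dashes.
name = "big-hadoop-nfs-client"

first_only = name.sub("-", "_")   # replaces only the first dash
every_one  = name.gsub("-", "_")  # replaces every dash

puts first_only  # big_hadoop-nfs-client
puts every_one   # big_hadoop_nfs_client
```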

Some intrepid soul should figure out what is being done, then propose what
should be done -- being as backwards-compatible as reasonable, but going
forward with convention. I'd also look closer at the character replacement
being done.

flip

Contributor

nickmarden commented Nov 7, 2012

I think that this relates to a larger issue that I'd like to see fixed, specifically that we delay creation of security groups until individual server discovery. In a cluster with many servers, you wind up with a race condition in which numerous Threads have been spawned off and are racing each other to create a missing security group. This causes intermittent errors, either because of race conditions for access to the @@known security groups, or because AWS gets grumpy and rate-limits your API calls.

A better approach, IMO, would be to have a pre-individual-server-creation phase that is analogous to the post-server-creation phase that I've called 'aggregation'. This pre-individual-server-creation phase would construct any necessary and missing security groups.
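
A rough sketch of what such a pre-creation phase might look like, assuming each server can report the groups it belongs to and the groups it authorizes. `Server`, `required_groups`, and `prepare_groups` are hypothetical names for illustration, and the "create" step just adds to an in-memory set rather than calling AWS:

```ruby
require 'set'

# Hypothetical server description: its own groups plus the groups it authorizes.
Server = Struct.new(:name, :groups, :authorized_groups)

# Collect every group any server mentions, whether stated or targeted.
def required_groups(servers)
  servers.each_with_object(Set.new) do |server, needed|
    needed.merge(server.groups)
    needed.merge(server.authorized_groups)
  end
end

# Run once, in a single thread, before any per-server threads are spawned.
# Returns the set of groups that had to be created.
def prepare_groups(servers, existing)
  missing = required_groups(servers) - existing
  missing.each { |group| existing << group } # stand-in for one create call per group
  missing
end

servers = [
  Server.new("control-nfs-0",    ["nfs_server"], ["nfs_client"]),
  Server.new("control-worker-0", ["nfs_client"], []),
  Server.new("control-worker-1", ["nfs_client"], [])
]
existing = Set.new(["nfs_server"])
created  = prepare_groups(servers, existing)
puts created.to_a.inspect
```

Because the set of needed groups is computed up front, each missing group is created at most once, no matter how many servers reference it.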

Contributor

nickmarden commented Nov 7, 2012

When I say "that I'd like to see fixed", I mean "I'm planning to fix it" :-)

Contributor

temujin9 commented Dec 12, 2012

@nickmarden - any progress here, or should I try getting a fix going here?

Contributor

nickmarden commented Dec 13, 2012

Between you and @meekmichael, I've been shamed into starting this. I'll let you know how it goes.

Contributor

nickmarden commented Dec 13, 2012

OK, it looks as though this is doable. First, @mrflip's conjecture that the groups are getting auto-created seems to be the result of remembering old code, dropping acid, or a deeper understanding of the code than I have. Second, there is a little bit of sophistication in getting Ironfan::Provider::Ec2::SecurityGroup.save! to back-propagate the determination of the needed security groups into the create!-equivalent code (now called prepare!) prior to the iteration over the authorization guarantee code.

I think I can get this done in about one more day.

No comments about the - vs. _ bit. I'm going to pretend that it was part of @mrflip's acid trip. Probably the brown acid?

Contributor

temujin9 commented Dec 13, 2012

The save! step does create groups, but only for the group stated, not the group targeted. The dash vs. underscore bit probably comes from misremembering a route53 hack (or maybe something that was going on with roles); I don't think it's relevant here.

Good luck, and godspeed.

@temujin9 temujin9 closed this in 78c2812 Dec 14, 2012
