fix (VPC-only?) bug when ssh permissions conflict with default open-ssh-to-world step of cluster setup #356




ypwais commented Jan 10, 2014

I recently got a new AWS account, which is tied to VPC services. (As Amazon notes, all new accounts going forward will only be able to use VPC and not EC2 Classic.) For whatever reason, my use of security group permissions causes starcluster to crash with this new account.

My config contains a rule to restrict SSH to a specific CIDR:
[permission ssh]
TO_PORT = 22

When I start a cluster with this config and the new VPC-only account, I get an error about duplicate permissions:

!!! ERROR - InvalidPermission.Duplicate: the specified rule "peer:, TCP, from port: 22, to port: 22, ALLOW" already exists

I believe the error occurs because starcluster opens SSH to the world by default when creating a security group, and for whatever reason my new VPC-only account errors out because my ssh permission spec overlaps with the rule that starcluster sets up by default. I've tried monkeying with my default security group in the EC2 console but that doesn't seem to help.

What works is making starcluster simply not open SSH to the world if there are permissions for ssh set up later. The patch below seems to fix my bug, though I note that _add_permissions_to_sg() seems to do something similar to revoke world ssh. Let me know if you see a cleaner way to fix this bug and I'll change my patch.
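For readers following along, the idea in the paragraph above can be sketched roughly as follows. This is not the patch itself: the permission dict keys (`ip_protocol`, `from_port`, `to_port`) and both function names are hypothetical and chosen only for illustration.

```python
SSH_PORT = 22


def covers_ssh(perm):
    """Return True if a permission (hypothetical dict layout) includes TCP port 22."""
    return (
        perm["ip_protocol"] == "tcp"
        and perm["from_port"] <= SSH_PORT <= perm["to_port"]
    )


def should_open_world_ssh(user_permissions):
    """Only add the default world-open SSH rule when the user configured no SSH rule."""
    return not any(covers_ssh(p) for p in user_permissions)


# A user-supplied SSH rule suppresses the default; an unrelated rule does not.
ssh_rule = {"ip_protocol": "tcp", "from_port": 22, "to_port": 22}
web_rule = {"ip_protocol": "tcp", "from_port": 80, "to_port": 80}
print(should_open_world_ssh([ssh_rule]))
print(should_open_world_ssh([web_rule]))
```

Skipping the default rule up front avoids having to authorize it and then revoke it again later.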


ypwais commented Jan 10, 2014

Oops, it turns out my CIDR was broken: I didn't mean to use a range of /0. I guess EC2 considers a /0 CIDR to mean "close to all ips", which conflicts with the "open to world" rule in an unexpected way.
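A quick way to see why a /0 range collides with the world-open rule is Python's standard `ipaddress` module. This is only an illustration of CIDR arithmetic; the address below is a documentation-range placeholder, not the one from the actual config, and whether EC2 normalizes rules the same way is an assumption that would be consistent with the duplicate error.

```python
import ipaddress

# strict=False allows host bits to be set; a /0 prefix then masks all of them away.
user_rule = ipaddress.ip_network("203.0.113.7/0", strict=False)
world_rule = ipaddress.ip_network("0.0.0.0/0")

print(user_rule)                # 0.0.0.0/0
print(user_rule == world_rule)  # True
print(user_rule.num_addresses)  # 4294967296, i.e. every IPv4 address
```

With a prefix length of 0 no bits of the address are significant, so any address written with /0 describes the same network as 0.0.0.0/0.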

Please consider this pull request rescinded for now.

@ypwais ypwais closed this Jan 10, 2014

@jtriley jtriley reopened this Feb 7, 2014


jtriley commented Feb 7, 2014

This happens because StarCluster creates the group, applies default SSH permissions, and then returns the original security group object without refetching it, which means the object is missing the latest applied rules/grants. I'm working on a patch now that should fix this.
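The stale-object problem described above can be reproduced in miniature without AWS. The sketch below is not StarCluster or boto code: `FakeEC2`, `FakeGroup`, and the two `create_*` helpers are hypothetical stand-ins that only model "the service owns the rules, the client holds a snapshot".

```python
WORLD_SSH = ("tcp", 22, 22, "0.0.0.0/0")


class FakeGroup:
    """Client-side snapshot of a security group (hypothetical, not boto)."""

    def __init__(self, name, rules=()):
        self.name = name
        self.rules = list(rules)


class FakeEC2:
    """Hypothetical in-memory service holding the authoritative rule list."""

    def __init__(self):
        self._groups = {}

    def create_group(self, name):
        self._groups[name] = []
        return FakeGroup(name)  # snapshot taken before any rule exists

    def authorize(self, name, rule):
        if rule in self._groups[name]:
            raise ValueError("InvalidPermission.Duplicate: %r" % (rule,))
        self._groups[name].append(rule)

    def get_group(self, name):
        return FakeGroup(name, self._groups[name])  # fresh snapshot


def create_stale(ec2, name):
    """Buggy pattern: return the object obtained before the rule was applied."""
    group = ec2.create_group(name)
    ec2.authorize(name, WORLD_SSH)
    return group


def create_fresh(ec2, name):
    """Fixed pattern: refetch the group so the snapshot includes the new rule."""
    ec2.create_group(name)
    ec2.authorize(name, WORLD_SSH)
    return ec2.get_group(name)


ec2 = FakeEC2()
stale = create_stale(ec2, "stale-group")
fresh = create_fresh(ec2, "fresh-group")
print(stale.rules)  # []
print(fresh.rules)  # [('tcp', 22, 22, '0.0.0.0/0')]
```

Any later code that inspects `stale.rules` to decide whether a world-open SSH rule needs handling sees nothing, so it may try to authorize an equivalent rule again and hit the duplicate error.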


ypwais commented Feb 7, 2014

Ya, I did notice something wonky but couldn't get a solid repro and things worked after I fixed my CIDR. Thanks for attacking this!


jtriley commented Feb 7, 2014

@ypwais Yea this confused me at first because StarCluster specifically handles the case of the user wanting to customize the SSH permissions. That code wasn't being invoked properly though due to the missing security group rules. Turns out the fix is relatively simple. Will merge it soon...

@jtriley jtriley closed this in 4ee8f76 Feb 11, 2014
