Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New resource: aws_network_interface_sg_attachment #860

Merged

Conversation

vancluever
Copy link
Contributor

This is a transfer of work from hashicorp/terraform#15167.

This adds the aws_security_group_attachment resource, allowing one to
attach security groups to ENIs outside of an aws_instance or
aws_network_interface resource.

Use cases for this would include more granular management of security
groups, or attachment of security groups to instances that are managed
out-of-band from Terraform.

This is a transfer of work from hashicorp/terraform#15167.

This adds the aws_security_group_attachment resource, allowing one to
attach security groups to ENIs outside of an aws_instance or
aws_network_interface resource.

Use cases for this would include more granular management of security
groups, or attachment of security groups to instances that are managed
out-of-band from Terraform.
Copy link
Member

@radeksimko radeksimko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for supporting such resource.

Besides the comments I left there:

Is there a reason to (or not to) support management of multiple SGs, i.e. security_group_ids instead of security_group_id? I'm not suggesting either way - genuine question - as I'm curious if you thought about that. 😃

Do I understand correctly this resource won't work in EC2 Classic?

I'd think about alternative name that can fit better in the context of other resources which accept SGs too (RDS instance, Lambda, ELB, ALB, ElastiCache, EFS MT, RedShift, LaunchConfiguration).

How about aws_network_interface_sg_attachment? I know we tend to stick to full names, but aws_network_interface_security_group_attachment seems way too long and the ENI is more important (than SG) in the overall context IMO.

and the [`aws_network_interface`][2] resources. Using this resource in
conjunction with security groups provided in-line in these resources will cause
conflicts, and will lead to spurious diffs and undefined behavior - please use
one or the other.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

func delSGFromENI(conn *ec2.EC2, sgID string, iface *ec2.NetworkInterface) error {
old := iface.Groups
var new []*string
for _, v := range iface.Groups {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 Isn't it dangerous to delete groups which we do not manage in this context by default? I don't mind having force_delete bool field which enables such behaviour but I think this is kind of breaking a promise to never touch infrastructure it didn't create or doesn't manage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, let me see if there's a safer way to do this. The main issue is that I don't think ModifyNetworkInterfaceAttribute allows you to individually add/remove security groups, but that's something I need to double check. So to effectively remove the security group we need to loop over the groups, skip the group we are removing, and build a set off the rest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, that's a bit annoying (the API limitation) 😢 Let us know if you find a better way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't find a better way I think we'll need to use mutex to avoid modifying the same ENI from different resources in parallel.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack - good catch 😨

Do we have a provider-level mutex for this kind of thing?

Copy link
Contributor Author

@vancluever vancluever Jun 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like mutexkv is my answer. :) Will roll this in if there's no other way (I did check the SDK docs though and could not see anything more granular).

EDIT: Is the global awsMutexKV safe to use?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The safety just depends on the name of the mutex you pick - if we assume ENI IDs are globally unique (across regions) then that should be sufficiently safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All done adding in the locks and yeah it does look like it was racing, but not anymore! 😀


func attachSecurityGroupToInterface(d *schema.ResourceData, meta interface{}) error {
sgID := d.Get("security_group_id").(string)
interfaceID := d.Get("network_interface_id").(string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we rely on this field, shouldn't it be Required in the schema then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated - this was like this because this resource supported both ENIs and actual instances at one point in time.

return baseConfig + fmt.Sprintf(optionalConfig, externalResPre, externalDataPre, externalAttrPre)
}
return baseConfig
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a while to understand the logic here (+ imagine what exactly is the sprintf going to do) and what's the reason for the condition.

It may be just me, but I personally think for a test case this logic is overly complicated - having two, nearly identical tests with similar configs would IMO work better. Duplication isn't always bad. 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And after all the two use cases are sufficiently different (data source vs resource) that we should test them separately anyway, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

De-coupled the configuration generation functions for the most part... I still parameterize on enabling the attachment so that we can test removal on the second step, but things should hopefully be a lot more clear now. Let me know if it isn't though. 🙂

@radeksimko radeksimko added the waiting-response Maintainers are waiting on response from community or contributor. label Jun 25, 2017
@vancluever
Copy link
Contributor Author

Hey @radeksimko!

I'll carry out the updates/changes when I have a chance here - just wanted to address the mainline comment about singular vs multiple security groups:

I think my opinion on that is that I prefer the singular form better, on part of that in light of things like count existing, we should favour granularity whenever we can. It's not too much for the user to really understand in the grand scheme of things (and examples using count can be documented as well).

The only thing I could think of when writing this that could be a problem would be graph sprawl, but that's kind of what automation is for - and also the one-to-one mapping might be easier to read for that matter.

The biggest win is I think much less complicated code - the code only has to support the simplest use case, which ultimately should mean safer code in the long run.

The requested changes make sense - just need to check if there's any way we can make the SG modification step safer - but otherwise everything else seems straightforward. Will update soon!

@radeksimko
Copy link
Member

think my opinion on that is that I prefer the singular form better, on part of that in light of things like count existing

That makes total sense 👌 Looks like you thought that through, good. 👍 😃

Chris Marchesi added 4 commits June 25, 2017 20:37
Renamed as pre review comments in hashicorp#860. This should help differentiate
it between the other kinds of security groups available in the AWS
provider.
This attribute was set to optional back when this resource allowed
either an instance or network interface specified. Now that this is no
longer the case, there's no reason to keep it this way.
Make the test configs a bit easier to understand. Each case (via
resource or data source) now has its own config, but we still
parameterize on enabling/disabling the security group resource for the
removal check.
The resource was actually racing when there was multiple attachments
trying to work with the same network interface. This is fixed now with
locks added in create and delete.

The added test checks the race in a couple of steps, switching up the
resource names on the second run for the security groups and security
group attachments to get a good mix of creation and deletion events to
really test the effectiveness of the serialization.

Also a small cosmetic re-refactoring of test names and configuration
generation functions.
@vancluever vancluever changed the title New resource: aws_security_group_attachment New resource: aws_network_interface_sg_attachment Jun 26, 2017
@vancluever
Copy link
Contributor Author

Alright, all updated! Definitely good we caught the locking too as it was for sure racing. I put in a good test for that I think that should generate enough concurrency to ensure that the locking is working.

Anything else that needs fixing let me know 🙂

@radeksimko radeksimko removed the waiting-response Maintainers are waiting on response from community or contributor. label Jun 26, 2017
Copy link
Member

@radeksimko radeksimko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now looking functionally good 👍

Besides my comments in code I think it would be good if the interfaces of functions referenced from inside the CRUD weren't so greedy 😄

e.g. attachSecurityGroupToInterface surely doesn't need the whole schema and certainly not all metadata.

Let me know what you think.

resource.TestStep{
Config: testAccAwsNetworkInterfaceSGAttachmentRaceCheckConfig("step2"),
Check: checkSecurityGroupAttachmentRace("step2"),
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we're actually testing the scenario of multiple attachments at the same time here. 🤔
I think we just create & destroy and create & destroy another set afterwards.

Wouldn't it be easier to just have a single config with 4 aws_network_interface_sg_attachment referencing the same network_interface_id? Then Terraform will create all 4 resources at the same time by default without us having to explicitly do something in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Step 2 should basically destroy all aws_network_interface_sg_attachment_step1.* instances and create aws_network_interface_sg_attachment_step2.*. This should have the effect of sufficient chaos as 4 detachment and 4 attachment operations are going on at the same time in step 2 (4 resources being destroyed and 4 being created).

I think what might need work is aws_security_group.sg_step1|step2 - I don't know if maintaining these separately is getting the desired above effect and maybe a central group of SGs should be created instead.

}
}

func checkSecurityGroupAttachment(checkPrimaryInterfaceAttr bool, expected bool) resource.TestCheckFunc {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a nitpick, but where possible string arguments are usually better in readability than boolean ones as it's more obvious how do they affect the behaviour without having to read the body of the function (related to checkPrimaryInterfaceAttr).

func resourceAwsNetworkInterfaceSGAttachmentCreate(d *schema.ResourceData, meta interface{}) error {
mk := "network_interface_sg_attachment_" + d.Get("network_interface_id").(string)
awsMutexKV.Lock(mk)
defer awsMutexKV.Unlock(mk)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if the mutexes were inside attachSecurityGroupToInterface & detachSecurityGroupFromInterface so we could encapsulate the logic of detaching/attaching SG and avoid forgetting mutexes when these functions get called elsewhere in the codebase.

That said I understand you probably put them here to keep it locked until we read the data back in resourceAwsNetworkInterfaceSGAttachmentRead below, which is reasonable due to eventual consistency issues, so I guess you can ignore me. 🤷‍♂️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, these are on the outer CRUD functions because I really don't want any dependent read operations getting messed up by concurrency (there are possibly ones in the create/delete operations too).

With that said, I think the overly functional nature of this code in its current form (again a relic of when I had this supporting instances too), so I think on the next refactor run I will probably just flatten most of this functionality to where it makes sense (at the very least, on attachSecurityGroupToInterface and detachSecurityGroupFromInterface).

NetworkInterfaceIds: aws.StringSlice([]string{interfaceID}),
}

dniResp, err := conn.DescribeNetworkInterfaces(dniParams)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we can't use fetchNetworkInterface here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't and I don't know why I missed this - will update 👍

@radeksimko radeksimko added waiting-response Maintainers are waiting on response from community or contributor. and removed waiting-response Maintainers are waiting on response from community or contributor. labels Jun 27, 2017
Simplified the interface check function so that the test case directly
takes the attribute that we are deriving the interface ID from, rather
than taking a bool. This actually uncovered the fact my attribute logic
was reversed (the bool logic was giving primary_network_interface_id for
a false value passed to checkPrimaryInterfaceAttr instead of a true
value, and this error was propagated to the test cases). So this is
fixed now as well.
@vancluever
Copy link
Contributor Author

Still working on the re-factor here, will knock out the rest tomorrow morning 👍

Chris Marchesi added 2 commits June 28, 2017 07:56
Race check needed simplifying as well, in addition to being reduced from
two steps to one.

Reason for the latter is once security groups were modified so that the
were operating off the same set of groups in both steps (so step1 ->
step2), it was discovered that there was no way we could reasonably
expect the deletion/creation order would never favour a situation where
the new SG attachment would be ordered after the old SG attachment was
removed (as both step1 destroys and step2 creations would be happening
at the same level in the graph and without any dependencies).
Removed a bunch of the single-use functions, which has moved most logic
to the main CRUD functions. The old functions served a purpose when this
resource was designed to support both instance IDs and network interface
IDs, but just adds more cruft now that only network interface IDs are
supported.

Also moved all the messages to the DEBUG level as TF_LOG=info does not
mean anything at a provider level, currently.
@vancluever
Copy link
Contributor Author

Alright, last little bit added - should address all of the new review items

Let me know if there's more 😉

Copy link
Member

@radeksimko radeksimko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just left you some nitpicky comments there - really low importance.

This now looks good to me - merge away! 🎉 👍

func resourceAwsNetworkInterfaceSGAttachmentDelete(d *schema.ResourceData, meta interface{}) error {
// Get a lock to prevent races on other SG attachments/detatchments on this
// interface ID. This lock is released when the function exits, regardless of
// success or failure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when the lock is released is obvious from the code below, so the second sentence seems a bit redundant.

}

func resourceAwsNetworkInterfaceSGAttachmentCreate(d *schema.ResourceData, meta interface{}) error {
// Get a lock to prevent races on other SG attachments/detatchments on this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detatchments -> detachments

func resourceAwsNetworkInterfaceSGAttachmentCreate(d *schema.ResourceData, meta interface{}) error {
// Get a lock to prevent races on other SG attachments/detatchments on this
// interface ID. This lock is released when the function exits, regardless of
// success or failure.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when the lock is released is obvious from the code below, so the second sentence seems a bit redundant. Why (the 1st sentence) is more important then what.

}

func resourceAwsNetworkInterfaceSGAttachmentDelete(d *schema.ResourceData, meta interface{}) error {
// Get a lock to prevent races on other SG attachments/detatchments on this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detatchments -> detachments

@radeksimko
Copy link
Member

I forgot you don't have merge perms 🤦‍♂️ merging now!

@radeksimko radeksimko merged commit a19c813 into hashicorp:master Jun 29, 2017
@vancluever
Copy link
Contributor Author

Haha, no prob... and I was about to remove those comments too 😜

I'll put in another PR!

@ghost
Copy link

ghost commented Apr 12, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Apr 12, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
new-resource Introduces a new resource.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants