Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AWS EC2 instances #777

Merged
merged 55 commits into from
Oct 5, 2021

Conversation

tnthornton
Copy link
Contributor

@tnthornton tnthornton commented Jul 22, 2021

Description of your changes

Add ec2/instance

Fixes #85

I have:

  • Read and followed Crossplane's contribution process.
  • Run make reviewable test to ensure this PR is ready for review.

How has this code been tested

basic create/delete so far

## create
[add-ec2-instances][~/codez/provider-aws]$ kubectl apply -f examples/ec2/instance.yaml
instance.ec2.aws.crossplane.io/sample-instance36 created

## list
[add-ec2-instances][~/codez/provider-aws]$ kubectl get instance sample-instance36
NAME                READY   SYNCED   INSTANCES   RUNNING   AGE
sample-instance36   True    True     2           2         37m
[add-ec2-instances][~/codez/provider-aws]$ kubectl get instance sample-instance36 -o wide
NAME                READY   SYNCED   INSTANCES   RUNNING   AGE   ID                  PENDING   SHUTTINGDOWN   STOPPED   STOPPING   TERMINATED
sample-instance36   True    True     2           2         37m   sample-instance36   0         0              0         0          0

## delete
[add-ec2-instances][~/codez/provider-aws]$ kubectl delete instance sample-instance36
instance.ec2.aws.crossplane.io "sample-instance36" deleted
[add-ec2-instances][~/codez/provider-aws]$ kubectl get instance
No resources found
[add-ec2-instances][~/codez/provider-aws]$ kubectl get instance sample-instance36
Error from server (NotFound): instances.ec2.aws.crossplane.io "sample-instance36" not found

@tnthornton tnthornton marked this pull request as draft July 22, 2021 17:40
@tnthornton
Copy link
Contributor Author

This is very much still a WIP. FWIW, I started from the vpc resource which is where you'll see a number references to vpc in the comments.

@tnthornton
Copy link
Contributor Author

tnthornton commented Jul 26, 2021

Alrighty, I think I got to a point where it would be good to understand what the community's desired outcome is.

Background

As @muvaf called out in #85 (comment), the AWS SDK API defines a RunInstances API for creating/running EC2 Instances, which means we can't generate most of the code through the ACK Generation Tool. 🤷

While manually implementing the API, most of it seems pretty straightforward until you get to maxCount and minCount (https://github.com/crossplane/provider-aws/pull/777/files#diff-4c32c59577ee9db86c31a77678689c5bc9128ae2a875b514ec393f12a67ffd62R197). These properties allow you to define the max/min number of instances to launch. In practice that's really no big deal; however given we're trying to externally manage these resources the behavior ends up being something we probably need to think about how to handle.

The behavior that I'm seeing is that each instance ends up with its own instanceID. It looks like in practice we use the resource ID we get back from AWS to keep track of the resource (and subsequently add it to the resource as the external-name).
Note there are other details about the individual instances that are different as well, for example IPv4 addresses, etc.

Problem

It's probably easiest to outline the issue in pictures.
Given a Instance definition like the following (note I added the EC2 Instance special tag to make it easier to pick out the instances for this example):

apiVersion: ec2.aws.crossplane.io/v1alpha1
kind: Instance
metadata:
  name: pr-example
spec:
  forProvider:
    region: us-east-1
    imageId: ami-0dc2d3e4c0f9ebd18
    maxCount: 4
    minCount: 1
    tags:
      - key: Name
        value: pr-example  
  providerConfigRef:
    name: example

We get the following:

[add-ec2-instances][~/codez/provider-aws]$ kubectl apply -f examples/ec2/instance.yaml
instance.ec2.aws.crossplane.io/pr-example created

[add-ec2-instances][~/codez/provider-aws]$ kubectl get instance pr-example
NAME         READY   SYNCED   INSTANCES   RUNNING   AGE
pr-example   True    True     4           4         3m

Screen Shot 2021-07-26 at 12 26 09 PM

Note the different instanceIDs and different IPv4 addresses and DNS.

If we were just dealing with a handful of instances getting spun up, I think adding a slice of instanceIDs to the external-name annotation wouldn't be a big deal, however this pretty quickly gets into a state that doesn't scale well. Imagine someone asking for 100 instances, 1000 instances, 10k instances.

Solution

Currently, I've implemented grouping based on the external labels (tags) that are applied to the instances so I can observe the instances as a group and delete as a group through performing a DescribeInstanceInput that has a filter using those external labels. That all seems to work just fine.

Where I'm starting to get into behavior I'm not sure whether or not its desirable is in keeping track of what we observe about the instances. For example, should we be tracking the DNS and addresses for these instances in order to display that to the end user? If so, it seems like we can totally get into the problematic behavior I outlined above when we're talking about 100+ instances.

What are your thoughts?

cc: @negz @hasheddan @muvaf @jbw976

@tnthornton
Copy link
Contributor Author

I guess one thought is if it's not possible to remotely ask for that many resources, the above issue is probably moot.

@tnthornton
Copy link
Contributor Author

Doing some research on the side, it looks like in my example above I'm using On-Demand instances, specifying the m type (m1.small). If that is true, it looks like for the account that I deployed into (which could be different for others), the limit on the at account is:
Screen Shot 2021-07-26 at 12 55 37 PM
but, for m1.small's (at least what I have provisioned so far), the vCPUs that are allocated are only 1:
Screen Shot 2021-07-26 at 12 56 38 PM

So unless I'm chasing something off base, it looks like the given configuration could go as high as 512 before I hit the account limit (assuming those were the only instances running).

@negz
Copy link
Member

negz commented Jul 26, 2021

@tnthornton my intuition here is that this is one of the cases in which we need to 'interpret' the API a little to make it a little more declarative, while still aiming for (mostly) high fidelity. I would not expose the MinCount and MaxCount parameters under spec.forProvider (or at all) and instead always call RunInstances with a max and min of 1. This way if someone wants three instances they must declaratively create three Instance resources, not one Instance resource.

I just took a quick look at how our friends over on the Terraform project handle this, and it appears they use a fixed max/min; https://github.com/hashicorp/terraform-provider-aws/blob/e9309698/aws/resource_aws_instance.go#L739

@tnthornton
Copy link
Contributor Author

Awesome. That makes things easier.

@tnthornton
Copy link
Contributor Author

tnthornton commented Jul 26, 2021

While we're on the subject of 'interpreting' the API, there's this setting that's available that seems dangerous:

	// If you set this parameter to true, you can't terminate the instance using
	// the Amazon EC2 console, CLI, or API; otherwise, you can. To change this attribute
	// after launch, use ModifyInstanceAttribute (https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_ModifyInstanceAttribute.html).
	// Alternatively, if you set InstanceInitiatedShutdownBehavior to terminate,
	// you can terminate the instance by running the shutdown command from the instance.
	//
	// Default: false
	// +optional
	DisableAPITermination *bool `json:"disableAPITermination,omitempty"`

"Dangerous" in the sense that it could result in an unpleasant experience for the end user. Our friends over at Terraform include it https://github.com/hashicorp/terraform-provider-aws/blob/e9309698/aws/resource_aws_instance.go#L728, however arguably we have a slightly different use case and are less "fire and forget".

Thoughts?

@AaronME
Copy link
Collaborator

AaronME commented Jul 26, 2021

The MinCount/MaxCount was meant to act as a gate on the instance launches. From the docs

The minimum number of instances to launch. If you specify a minimum that is more instances than Amazon EC2 can launch in the target Availability Zone, Amazon EC2 launches no instances.

One of the use-cases would be launching a block of workers to perform a bulk data task and needing to ensure these instances all exist in the same az. Using RunInstances the deployment tool could search for an az with enough capacity to handle the processing.

If we felt like honoring the behavior, we could implement "RunInstances" as an "order for instances," with no promise to manage the instances that are created. We could capture the instancesSet that is returned when the order is fulfilled. The user could capture this InstanceSet and pass it to a TerminateInstances Resource for cleanup.

This pattern works for the AWS API, because a failed call to "RunInstances" leaves no artifacts. In crossplane there would be a number of "failed" "RunInstances" objects that require cleaning up.

As @negz already said, the Instance resource in should always be a single (min=1, max=1).

@tnthornton
Copy link
Contributor Author

@AaronME that's an interesting idea. So in my mind that would be something similar to a replicaSet.

@negz
Copy link
Member

negz commented Jul 26, 2021

I think we should stick with a single Instance for the time being - we could potentially consider a higher level set of instances as a separate concern in future if we see use cases arising.

While we're on the subject of 'interpreting' the API, there's this setting that's available that seems dangerous

If I'm following correctly, I think we can leave this one in. It seems like it's disabled by default, can be updated once set, and when enabled will prevent us from being able to delete the instance? We have a similar option exposed on the RDSInstance type per https://github.com/crossplane/provider-aws/blob/009f048/apis/database/v1beta1/rdsinstance_types.go#L291. If folks enable it, they must then disable it before they delete the resource.

@tnthornton
Copy link
Contributor Author

I think we should stick with a single Instance for the time being - we could potentially consider a higher level set of instances as a separate concern in future if we see use cases arising.

Sounds good.

If I'm following correctly, I think we can leave this one in. It seems like it's disabled by default, can be updated once set, and when enabled will prevent us from being able to delete the instance? We have a similar option exposed on the RDSInstance type per https://github.com/crossplane/provider-aws/blob/009f048/apis/database/v1beta1/rdsinstance_types.go#L291. If folks enable it, they must then disable it before they delete the resource.

Yep, spot on. I couldn't quite find a similar approach (poor searching skills apparently) which was why I raised it.

Alrighty with that all cleared up, I'll adjust the current impl to focus on just the single Instance and then cut over to support Updates so if users do enable DisableAPITermination they can subsequently disable it again.

@tnthornton tnthornton marked this pull request as ready for review July 28, 2021 22:08
@tnthornton tnthornton force-pushed the add-ec2-instances branch 2 times, most recently from b61051d to e6cf850 Compare August 1, 2021 15:12
@AaronME AaronME added the size/L label Aug 12, 2021
@tnthornton tnthornton force-pushed the add-ec2-instances branch 6 times, most recently from 898e33f to 9e58b42 Compare September 13, 2021 22:11
@AaronME AaronME self-assigned this Sep 24, 2021
@AaronME AaronME requested review from AaronME and removed request for muvaf and hasheddan September 24, 2021 17:32
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
add additional instance counts to api output

Signed-off-by: Taylor Thornton <taylor@upbound.io>
adds support for user defined tags
adjusts how we are grouping instances from the special 'Name' tag to using the external resource labeling

Signed-off-by: Taylor Thornton <taylor@upbound.io>
add DisableAPITermination
adjust management approach to focus on a single Instance object rather than a group

Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
adds late-init flow

Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
…al as they aren't guaranteed to come back

Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Signed-off-by: Taylor Thornton <taylor@upbound.io>
Copy link
Collaborator

@AaronME AaronME left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnthornton update loop is gone. This is good to merge.

Thank you for the contribution!

@AaronME AaronME merged commit 76bc036 into crossplane-contrib:master Oct 5, 2021
@negz
Copy link
Member

negz commented Oct 7, 2021

Thanks @tnthornton and @AaronME for working to get this merged - I think this will be huge for our users.

tektondeploy pushed a commit to gtn3010/provider-aws that referenced this pull request Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for Amazon EC2 Instances
3 participants