Implement 1/3rds of output properties #186
This change includes approximately 1/3rd of the change necessary to support output properties, as per #90.

In short, the runtime now has a new hidden type, Latent<T>, which represents a "speculative" value, whose eventual type will be T, that we can use during evaluation in various ways. Namely, operations against Latent<T>s generally produce new Latent<U>s.

During planning, any Latent<T>s that end up in resource properties are transformed into "unknown" property values. An unknown property value is legal only during planning-time activities, such as Check, Name, and InspectChange. As a result, those RPC interfaces have been updated to include lookaside maps indicating which properties have unknown values. My intent is to add some helper functions to make dealing with this circumstance more correct-by-construction.

For now, using an unresolved Latent<T> in a conditional will lead to an error. See #67. Speculating beyond these -- by supporting iterative planning and application -- is something we want to support eventually, but it makes sense to do that as an additive change beyond this initial support. That is a missing 1/3rd.

Finally, the other missing 1/3rd, which will happen much sooner than the rest, is restructuring plan application so that it will correctly observe resolution of Latent<T> values. Right now, the evaluation happens in one single pass, prior to the application, and so Latent<T>s never actually get witnessed in a resolved state.
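To make the propagation rule concrete, here is a minimal Go sketch of a latent value whose operations yield new latent values. The names `Latent`, `Known`, `Unknown`, `Map`, and `ErrUnresolved` are hypothetical and chosen for illustration; this is not the runtime's actual representation.

```go
package main

import (
	"errors"
	"fmt"
)

// Latent is a hypothetical stand-in for the runtime's hidden Latent<T>:
// a value of eventual type T that may not have been resolved yet.
type Latent[T any] struct {
	value    T
	resolved bool
}

// Known wraps a value that is already available.
func Known[T any](v T) Latent[T] { return Latent[T]{value: v, resolved: true} }

// Unknown creates a latent value that has not been resolved.
func Unknown[T any]() Latent[T] { return Latent[T]{} }

// ErrUnresolved is returned when an unresolved value is demanded, for
// example by a conditional that needs a concrete answer.
var ErrUnresolved = errors.New("latent value is unresolved")

// Map applies f to a latent value and produces a new latent value.
// An unresolved input yields an unresolved output without calling f.
func Map[T, U any](l Latent[T], f func(T) U) Latent[U] {
	if !l.resolved {
		return Unknown[U]()
	}
	return Known(f(l.value))
}

// Value returns the concrete value, or ErrUnresolved if there is none yet.
func (l Latent[T]) Value() (T, error) {
	if !l.resolved {
		var zero T
		return zero, ErrUnresolved
	}
	return l.value, nil
}

func main() {
	// During planning, a provider-assigned ID is not known yet, so
	// anything derived from it is not known either.
	id := Unknown[string]()
	url := Map(id, func(s string) string { return "https://" + s })
	_, err := url.Value()
	fmt.Println("planning:", err)
}
```

In this sketch, demanding `Value()` plays the role of the conditional described above: it is the point where an unresolved value becomes an error rather than propagating further.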
We need to smuggle metadata from the resource IDL all the way through to the runtime, so that it knows which things are output properties. In order to do this, we'll leverage decorators and the support for serializing them as attributes. This change adds support for the various kinds (class, property, method, and parameter), in addition to test cases.
This change pretty-prints attribute metadata in `lumi pack info`. For example:

    package "basic/decorators" {
        dependencies []
        module "index" {
            exports []
            method ".main": ()
            class "TestDecorators" [@basic/decorators:index:classDecorate] {
                property "a" [public, @basic/decorators:index:propertyDecorate]: string
                method "m1" [public, @basic/decorators:index:methodDecorate]: (): string
            }
        }
    }

It also includes support for printing property getters/setters:

    property "p1" [public]: string {
        method "get" [public, @basic/decorators:index:methodDecorate]: (): string
        method "set" [public]: (v: string)
    }
This change adds a @lumi.out decorator and modifies LumIDL to emit it on output properties of resource types. There still isn't any runtime awareness; however, this is an isolated change that will facilitate it.
This change introduces the notion of a computed versus an output property on resources. Technically, output is a subset of computed; however, it is a special kind that we want to treat differently during the evaluation of a deployment plan. Specifically:

* An output property is any property that is populated by the resource provider, not code running in the Lumi type system. Because these values aren't available during planning -- since we have not yet performed the deployment operations -- they will be latent values in our runtime and generally missing at the time of a plan. This is no problem and we just want to avoid marshaling them in inopportune places.

* A computed property, on the other hand, is a different beast altogether. Although it is true that one of these is also missing a value -- by virtue of the fact that they too are latent values, bottoming out in some manner on an output property -- they will appear in serializable input positions. Not only must we treat them differently during the RPC handshake and in the resource providers, but we also want to guarantee they are gone by the time we perform any CRUD operations on a resource. They are purely a planning-time-only construct.
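The distinction can be sketched in Go roughly as follows. The type `Property`, the `Kind*` constants, and `MarshalInputs` are hypothetical names used only to illustrate the rules above: outputs are skipped when marshaling inputs, computed values are reported as unknown during planning, and any computed value still present at apply time is an error.

```go
package main

import (
	"fmt"
	"sort"
)

// Kind classifies a resource property (hypothetical, for illustration).
type Kind int

const (
	KindKnown    Kind = iota // concrete value supplied by the program
	KindOutput               // populated by the resource provider
	KindComputed             // derived from an output; planning-time only
)

// Property pairs a kind with a value; the value is empty unless known.
type Property struct {
	Kind  Kind
	Value string
}

// MarshalInputs builds the input bag sent to a provider. Outputs are never
// marshaled as inputs. Computed properties are listed by name as unknown
// while planning, and rejected once we are about to perform CRUD operations.
func MarshalInputs(props map[string]Property, planning bool) (map[string]string, []string, error) {
	inputs := make(map[string]string)
	var unknown []string
	for name, p := range props {
		switch p.Kind {
		case KindKnown:
			inputs[name] = p.Value
		case KindOutput:
			// Skip: the provider fills this in.
		case KindComputed:
			if !planning {
				return nil, nil, fmt.Errorf("property %q is still computed at apply time", name)
			}
			unknown = append(unknown, name)
		}
	}
	sort.Strings(unknown) // deterministic order for display and tests
	return inputs, unknown, nil
}

func main() {
	props := map[string]Property{
		"name": {KindKnown, "web"},
		"arn":  {KindOutput, ""},
		"url":  {KindComputed, ""},
	}
	inputs, unknown, err := MarshalInputs(props, true)
	fmt.Println(inputs, unknown, err)
}
```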
This change prepares for integrating more planning and deployment logic closer to the runtime itself. For historical reasons, we ended up with these in the env.go file which really has nothing to do with deployments anymore.
This change modifies the existing resource provider RPC interface slightly. Instead of the Create API returning the bag of output properties, we will rely on the Get API to do so. As a result, this change takes an initial whack at implementing Get on all existing AWS resources. The Get API needs to return a fully populated structure containing all inputs and outputs. Believe it or not, this is actually part of #90. This was done because just returning output properties is insufficient. Any input properties that weren't supplied may have default values, for example, and it is wholly reasonable to expect Lumi scripts to depend on those values in addition to output values. This isn't fully functional in its current form, because doing this change turned up many other related changes required to enable output properties. For instance, at the moment resource properties are defined in terms of `resource.URN`s, and yet unfortunately the provider side knows nothing of URNs (instead preferring to deal in `resource.ID`s). I am going to handle that in a subsequent isolated change, since it will have far-reaching implications beyond just modifying create and get.
This change overhauls our AWS resource provider to use ARNs for all AWS resource IDs. We have gone backwards and forwards on whether to use the name, ID, or ARN for this purpose. This confusion was largely driven by the inconsistencies within the AWS APIs themselves: sometimes a resource's name is the preferred identifier, sometimes its ID, sometimes its ARN, and there are even cases where it differs based on context (e.g., nondefault versus default VPC).

Thankfully, ARNs are perfectly consistent, and well-defined, on this matter. See http://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html. Although any given API may request an ID or name that isn't part of the ARN, the ARN is always sufficiently complete and lossless to enable recovery of the information needed. And furthermore, even if an API doesn't want the full ARN, the resource name component of the resource's ARN is easily parseable and usable in this role.

To facilitate this, I've created a new `arn` package that has a number of factory and parsing helpers. Overall, some things are more verbose -- e.g., we must always translate from ARN space to names and vice versa -- however, using the ARN is clarifying and guarantees that we always have a way to get the information we need. And thanks to the removal of several workarounds in the code, I think we come out ahead code delta-wise.
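As an illustration of what factory and parsing helpers of this kind might look like, here is a small self-contained Go sketch based on the general ARN layout `arn:partition:service:region:account-id:resource`. The names `ARN`, `New`, `Parse`, and `ResourceName` are hypothetical and are not the functions in the PR's `arn` package; the `aws` partition is assumed.

```go
package main

import (
	"fmt"
	"strings"
)

// ARN holds the colon-separated components of an Amazon Resource Name.
type ARN struct {
	Partition, Service, Region, AccountID, Resource string
}

// New builds an ARN string in the "aws" partition (an assumption of this sketch).
func New(service, region, accountID, resource string) string {
	return fmt.Sprintf("arn:aws:%s:%s:%s:%s", service, region, accountID, resource)
}

// Parse splits an ARN into its components. Only the first five colons are
// treated as separators, because the resource part may itself contain colons.
func Parse(s string) (ARN, error) {
	parts := strings.SplitN(s, ":", 6)
	if len(parts) != 6 || parts[0] != "arn" {
		return ARN{}, fmt.Errorf("malformed ARN %q", s)
	}
	return ARN{
		Partition: parts[1],
		Service:   parts[2],
		Region:    parts[3],
		AccountID: parts[4],
		Resource:  parts[5],
	}, nil
}

// ResourceName strips a leading "type/" or "type:" prefix from the resource
// part, yielding the short name or ID that many AWS APIs ask for.
func (a ARN) ResourceName() string {
	if i := strings.IndexAny(a.Resource, "/:"); i >= 0 {
		return a.Resource[i+1:]
	}
	return a.Resource
}

func main() {
	id := New("ec2", "us-east-1", "123456789012", "instance/i-0abc")
	parsed, err := Parse(id)
	if err != nil {
		panic(err)
	}
	fmt.Println(parsed.Service, parsed.ResourceName())
}
```

The round trip is the point: a provider can hand the full ARN back to the engine as the resource ID and still recover the short name when an individual AWS call needs it.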
The change to flow logging to plugins is nice; however, it can be annoying because all writes to stderr are interpreted on the Lumi side as errors. After this change, we will only flow if --logflow is passed, e.g. as in

    $ lumi --logtostderr --logflow -v=9 deploy ...
This change skips printing output<T> properties as we perform a deployment, instead showing the real values inline after the resource has been created. (output<T> is still shown during planning, of course.)
This change makes progress on a few things with respect to properly receiving properties on the engine side, coming from the provider side, of the RPC boundary. The issues here are twofold:

1. Properties need to get unmapped using a JSON-tag-sensitive marshaler, so that they are cased properly, etc. For that, we have a new mapper.Unmap function (which is ultra lame -- see #138).

2. We have the reverse problem with respect to resource IDs: on the send side, we must translate from URNs (which the engine knows about) to provider IDs (which the provider knows about); similarly, then, on the receive side, we must translate from provider IDs back into URNs.

As a result of these getting fixed, we can now properly marshal the resulting properties back into the resource object during the plan execution, alongside propagating and memoizing its ID.
* The EC2 instance get function needs to return security group ARNs, not raw IDs.

* The EC2 security group rule CRUD operations printed pointers and not the values.
This change remembers which properties were computed as outputs, or even just read back as default values, during a deployment. This information is required in the before/after comparison in order to perform an intelligent diff that doesn't flag, for example, the absence of "default" values in the after image as deletions (among other things). As I was in here, I also cleaned up the way the provider interface works, dealing with concrete resource types, making it feel a little richer and less like we're doing in-memory RPC.
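A minimal sketch of the kind of diff this enables, assuming properties are flattened to string maps and the remembered outputs and defaults are kept as a set of names. The function `Deleted` and its parameters are hypothetical and only illustrate the idea.

```go
package main

import (
	"fmt"
	"sort"
)

// Deleted reports properties present in the before image but absent from the
// after image, ignoring any property that was recorded as an output or as a
// provider-supplied default during the previous deployment.
func Deleted(before, after map[string]string, outputs map[string]bool) []string {
	var deleted []string
	for name := range before {
		if _, stillPresent := after[name]; stillPresent {
			continue
		}
		if outputs[name] {
			// The program never set this value, so its absence from the
			// after image is not a deletion.
			continue
		}
		deleted = append(deleted, name)
	}
	sort.Strings(deleted) // deterministic order for display
	return deleted
}

func main() {
	before := map[string]string{"name": "web", "arn": "arn:aws:...", "tag": "old"}
	after := map[string]string{"name": "web"}
	outputs := map[string]bool{"arn": true}
	fmt.Println(Deleted(before, after, outputs))
}
```

Without the `outputs` set, `arn` would be reported as deleted on every update even though the program never supplied it.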
This change fixes up a few things so that updates correctly deal with output properties. This involves a few things:

1) All outputs stored on the pre snapshot need to get propagated to the post snapshot during planning at various points. This ensures that the diffing logic doesn't need to be special cased everywhere, including both the Lumi and the provider sides.

2) Names are changed to "input" properties (using a new `lumi` tag option, `in`). These are properties that providers are expected to know nothing about, which we must treat with care during diffs.

3) We read back properties, via Get, after doing an Update just like we do after performing a Create. This ensures that if an update has a cascading impact on other properties, it will be detected.

4) Inspecting a change, prior to updating, must be done using the computed property set instead of the real one. This is to avoid mutating the resource objects ahead of actually applying a plan, which would be wrong and misleading.

    "github.com/pulumi/lumi/pkg/resource"
)

// This file contains constants and factories for all sorts of AWS resource ARNs. In the fullness of time, it should
Can we put these in the individual service packages instead of centralizing them here? A bit of a shame to add another place we have to touch every time we add a new AWS service.
I agree entirely w/ the concern. In fact, I did it this way originally; however, I moved them to this central location when I realized it could be necessary for one package to use another's. E.g., maybe `lambda` needs to produce or consume an `iam` ARN. This seemed like it runs a high risk of creating cycles down the road.
Are you convinced? If not, I can split them up.
var attributes []dynamodb.Attribute
for _, attr := range tab.AttributeDefinitions {
    attributes = append(attributes, dynamodb.Attribute{
        Name: *attr.AttributeName,
Does it make sense to use `aws.StringValue` for this and other places where we are extracting strings from AWS API responses?
I agree, this would be cleaner, will fix this and any others I see.
I made this change, although to be honest, I'm conflicted about it. `aws.String` is useful because you can't take the address of function returns, constant strings, and so on. `aws.StringValue` seems like just a really long synonym for `*` 😉
There are definitely cases like `aws.StringValueSlice` that are valuable, because they compress multiple lines of boilerplate into a single function call.
For now, I'll keep going with the `StringValue`s, but I remain on the fence...
Notably, `aws.StringValue` returns `""` for nil references, so it is different than just a long synonym for `*`. That said, it's not clear whether we want the case of a nil reference appearing in one of these AWS API responses to trigger a panic or to coerce to `""`.
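The trade-off discussed in this thread can be seen in a few lines of Go. To stay self-contained, the sketch below defines a local `stringValue` that mimics the nil-to-empty-string behavior described above instead of importing the AWS SDK, plus a hypothetical `requireString` that shows the stricter alternative of surfacing a missing field as an error.

```go
package main

import "fmt"

// stringValue is a local stand-in that mimics the behavior described for
// aws.StringValue: a nil pointer becomes "" instead of causing a panic.
func stringValue(s *string) string {
	if s == nil {
		return ""
	}
	return *s
}

// requireString is a hypothetical stricter alternative: a nil pointer in an
// API response is reported as an error rather than silently coerced.
func requireString(field string, s *string) (string, error) {
	if s == nil {
		return "", fmt.Errorf("AWS response is missing required field %q", field)
	}
	return *s, nil
}

func main() {
	var missing *string
	name := "id"

	fmt.Printf("%q %q\n", stringValue(missing), stringValue(&name))

	if _, err := requireString("AttributeName", missing); err != nil {
		fmt.Println(err)
	}
	// A plain dereference (*missing) would panic here with a nil pointer
	// dereference, which is the third option raised in the thread.
}
```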
        return err
    }
}

if err := p.waitForTableState(name, true); err != nil {
I believe this is not necessary - the calls to `updateTable` each do this `waitForTableState`, as it has to reach Ready before the next update step is performed.
Oops! That was a complete merge fail, sorry. Removed this.
@@ -97,12 +96,30 @@ func (p *applicationProvider) Create(ctx context.Context, obj *elasticbeanstalk.
     if err != nil {
         return "", err
     }
-    return resource.ID(name), nil
+    return arn.NewElasticBeanstalkApplicationID(p.ctx.Region(), p.ctx.AccountID(), name), nil
 }

 // Read reads the instance state identified by ID, returning a populated resource object, or an error if not found.
Comment refers to Read
Good catch; I found a few others too (elasticbeanstalk/applicationVersion.go, elasticbeanstalk/environment.go, and lambda/function.go), and fixed those up.
} else if len(resp.Applications) == 0 {
    return nil, nil
}
contract.Assert(len(resp.Applications) == 1)
Should this be an error instead of an assert?
Yeah, we probably ought not to be asserting on any outputs from AWS APIs, just to be safe. It at least guards against our own misunderstanding of the APIs but they are fragile, inconsistent, and the risk seems really high that behavior will be incorrect.
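One way to turn such an assertion into a dynamic failure is sketched below. The helper `singleApplication` is hypothetical and works on plain strings rather than the SDK's response types; it keeps the "not found" case as `nil, nil`, as in the snippet under review, and reports an unexpected count as an ordinary error instead of taking down the provider.

```go
package main

import "fmt"

// singleApplication returns the only element of names, nil if there are
// none (the resource was not found), or an error if the API unexpectedly
// returned more than one match.
func singleApplication(names []string) (*string, error) {
	switch len(names) {
	case 0:
		return nil, nil
	case 1:
		return &names[0], nil
	default:
		return nil, fmt.Errorf("expected at most 1 application, got %d", len(names))
	}
}

func main() {
	if _, err := singleApplication([]string{"a", "b"}); err != nil {
		fmt.Println("provider error:", err)
	}
}
```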
This change responds to some great feedback from @lukehoban on #186. Namely:

* Relax many assertions in the providers. A failure here is pretty bad because it takes down the entire provider due to fail-fast. That said, I'd rather fail in response to a bug than let it go silently. Nevertheless, a few of them were candidates for dynamic failures.

* Use `aws.StringValue` and friends in more places.

* Remove a superfluous waitForTableState in DynamoDB.

* Fix some erroneous references to `Read` in some `Get` comments.
This PR accomplishes 1/3rd of the output properties feature (see #90). The 1/3rd intentionally left out for the time being is support for conditionals that are dependent on outputs (see #170). The other 1/3rd will be done after integrating this portion, as it has become increasingly painful to keep in a branch due to the RPC and provider changes.