Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement 1/3rds of output properties #186

Merged
merged 20 commits into from
Jun 1, 2017
Merged

Implement 1/3rds of output properties #186

merged 20 commits into from
Jun 1, 2017

Conversation

joeduffy
Copy link
Member

@joeduffy joeduffy commented Jun 1, 2017

This PR accomplishes 1/3rds of the output properties feature (see #90). 1/3rd intentionally left out for the time being is support for conditionals that are dependent on outputs (see #170). The other 1/3rd will be done after integrating this portion as it has become increasingly painful to keep in a branch due to the RPC and provider changes.

This change includes approximately 1/3rd of the change necessary
to support output properties, as per #90.

In short, the runtime now has a new hidden type, Latent<T>, which
represents a "speculative" value, whose eventual type will be T,
that we can use during evaluation in various ways.  Namely,
operations against Latent<T>s generally produce new Latent<U>s.

During planning, any Latent<T>s that end up in resource properties
are transformed into "unknown" property values.  An unknown property
value is legal only during planning-time activities, such as Check,
Name, and InspectChange.  As a result, those RPC interfaces have
been updated to include lookaside maps indicating which properties
have unknown values.  My intent is to add some helper functions to
make dealing with this circumstance more correct-by-construction.

For now, using an unresolved Latent<T> in a conditional will lead
to an error.  See #67.  Speculating beyond these -- by
supporting iterative planning and application -- is something we
want to support eventually, but it makes sense to do that as an
additive change beyond this initial support.  That is a missing 1/3.

Finally, the other missing 1/3rd which will happen much sooner
than the rest is restructuing plan application so that it will
correctly observe resolution of Latent<T> values.  Right now, the
evaluation happens in one single pass, prior to the application, and
so Latent<T>s never actually get witnessed in a resolved state.
We need to smuggle metadata from the resource IDL all the way through
to the runtime, so that it knows which things are output properties.  In
order to do this, we'll leverage decorators and the support for serializing
them as attributes.  This change adds support for the various kinds
(class, property, method, and parameter), in addition to test cases.
This change pretty-prints attribute metadata in `lumi pack info`.
For example:

    package "basic/decorators" {
        dependencies []
        module "index" {
            exports []
            method ".main": ()
            class "TestDecorators" [@basic/decorators:index:classDecorate] {
                property "a" [public, @basic/decorators:index:propertyDecorate]: string
                method "m1" [public, @basic/decorators:index:methodDecorate]: (): string
            }
        }
    }

It also includes support for printing property getters/setters:

    property "p1" [public]: string {
        method "get" [public, @basic/decorators:index:methodDecorate]: (): string
        method "set" [public]: (v: string)
    }
This change adds a @lumi.out decorator and modifies LumIDL to emit it on
output properties of resource types.  There still isn't any runtime
awareness, however, this is an isolated change that will facilitate it.
This change introduces the notion of a computed versus an output
property on resources.  Technically, output is a subset of computed,
however it is a special kind that we want to treat differently during
the evaluation of a deployment plan.  Specifically:

* An output property is any property that is populated by the resource
  provider, not code running in the Lumi type system.  Because these
  values aren't available during planning -- since we have not yet
  performed the deployment operations -- they will be latent values in
  our runtime and generally missing at the time of a plan.  This is no
  problem and we just want to avoid marshaling them in inopportune places.

* A computed property, on the other hand, is a different beast altogehter.
  Although true one of these is missing a value -- by virtue of the fact
  that they too are latent values, bottoming out in some manner on an
  output property -- they will appear in serializable input positions.
  Not only must we treat them differently during the RPC handshake and
  in the resource providers, but we also want to guarantee they are gone
  by the time we perform any CRUD operations on a resource.  They are
  purely a planning-time-only construct.
This change prepares for integrating more planning and deployment logic
closer to the runtime itself.  For historical reasons, we ended up with these
in the env.go file which really has nothing to do with deployments anymore.
This change modifies the existing resource provider RPC interface slightly.
Instead of the Create API returning the bag of output properties, we will
rely on the Get API to do so.  As a result, this change takes an initial
whack at implementing Get on all existing AWS resources.  The Get API needs
to return a fully populated structure containing all inputs and outputs.

Believe it or not, this is actually part of #90.

This was done because just returning output properties is insufficient.
Any input properties that weren't supplied may have default values, for
example, and it is wholly reasonable to expect Lumi scripts to depend on
those values in addition to output values.

This isn't fully functional in its current form, because doing this
change turned up many other related changes required to enable output
properties.  For instance, at the moment resource properties are defined
in terms of `resource.URN`s, and yet unfortunately the provider side
knows nothing of URNs (instead preferring to deal in `resource.ID`s).
I am going to handle that in a subsequent isolated change, since it will
have far-reaching implications beyond just modifying create and get.
This change overhauls our AWS resource provder to use ARNs for all
AWS resource IDs.  We have gone backwards and forwards on whether to
use the name, ID, or ARN for this purpose.  This confusion was largely
driven by the inconsistencies within the AWS APIs themselves: sometimes
a resource's name is the preferred identifier, sometimes its ID,
sometimes its ARN, and there are even cases where it differs based on
context (e.g., nondefault versus default VPC).

Thankfully, ARNs are perfectly consistent, and well-defined, on this
matter.  See http://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html.
Although any given API may request an ID or name that isn't part of the
ARN, the ARN is always sufficiently complete and lossless to enable
recovery of the information needed.  And furthermore, even if an API
doesn't want the full ARN, the resource name component of the resource's
ARN is easily parseable and useable in this role.

To facilitate this, I've created a new `arn` package that has a number
of factory and parsing helpers.

Overall, some things are more verbose -- e.g., we must always translate
from ARN space to names and vice versa -- however, using the ARN is
clarifying and guarantees that we always have a way to get the information
we need.  And in general, thanks to the removal of several workarounds in
the code, I think we come out ahead in general code delta-wise.
The change to flow logging to plugins is nice, however, it can be
annoying because all writes to stderr are interepreted on the Lumi
side as errors.  After this change, we will only flow if
--logflow is passed, e.g. as in

    $ lumi --logtostderr --logflow -v=9 deploy ...
This change skips printing output<T> properties as we perform a
deployment, instead showing the real values inline after the resource
has been created.  (output<T> is still shown during planning, of course.)
This change makes progress on a few things with respect to properly
receiving properties on the engine side, coming from the provider side,
of the RPC boundary.  The issues here are twofold:

    1. Properties need to get unmapped using a JSON-tag-sensitive
       marshaler, so that they are cased properly, etc.  For that, we
       have a new mapper.Unmap function (which is ultra lame -- see
       #138).

    2. We have the reverse problem with respect to resource IDs: on
       the send side, we must translate from URNs (which the engine
       knows about) and provider IDs (which the provider knows about);
       similarly, then, on the receive side, we must translate from
       provider IDs back into URNs.

As a result of these getting fixed, we can now properly marshal the
resulting properties back into the resource object during the plan
execution, alongside propagating and memoizing its ID.
* The EC2 instance get function needs to return security group ARNs, not raw IDs.

* The EC2 security group rule CRUD operations printed pointers and not the values.
This change remembers which properties were computed as outputs,
or even just read back as default values, during a deployment.  This
information is required in the before/after comparison in order to
perform an intelligent diff that doesn't flag, for example, the absence
of "default" values in the after image as deletions (among other things).
As I was in here, I also cleaned up the way the provider interface
works, dealing with concrete resource types, making it feel a little
richer and less like we're doing in-memory RPC.
This change fixes up a few things so that updates correctly deal
with output properties.  This involves a few things:

    1) All outputs stored on the pre snapshot need to get propagated
       to the post snapshot during planning at various points.  This
       ensures that the diffing logic doesn't need to be special cased
       everywhere, including both the Lumi and the provider sides.

    2) Names are changed to "input" properties (using a new `lumi` tag
       option, `in`).  These are properties that providers are expected
       to know nothing about, which we must treat with care during diffs.

    3) We read back properties, via Get, after doing an Update just like
       we do after performing a Create.  This ensures that if an update
       has a cascading impact on other properties, it will be detected.

    4) Inspecting a change, prior to updating, must be done using the
       computed property set instead of the real one.  This is to avoid
       mutating the resource objects ahead of actually applying a plan,
       which would be wrong and misleading.
@joeduffy joeduffy merged commit 4ec2745 into master Jun 1, 2017
@joeduffy joeduffy deleted the lumi-90-outprops branch June 1, 2017 17:28
"github.com/pulumi/lumi/pkg/resource"
)

// This file contains constants and factories for all sorts of AWS resource ARNs. In the fullness of time, it should
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put these in the individual service packages instead of centralizing them here? A bit of a shame to add another place we have to touch every time we add a new AWS service.

Copy link
Member Author

@joeduffy joeduffy Jun 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree entirely w/ the concern. In fact, I did it this way originally; however, I moved them to this central location when I realized it could be necessary for one package to use another's. E.g., maybe lambda needs to produce or consume an iam ARN. This seemed like it runs a high risk of creating cycles down the road.

Are you convinced? If not, I can split them up.

var attributes []dynamodb.Attribute
for _, attr := range tab.AttributeDefinitions {
attributes = append(attributes, dynamodb.Attribute{
Name: *attr.AttributeName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to use aws.StringValue for this and other places where we are extracting strings from AWS API responses?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this would be cleaner, will fix this and any others I see.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this change, although to be honest, I'm conflicted about it. aws.String is useful because you can't take the address of function returns, constant strings, and so on. aws.StringValue seems like just a really long synonym for * 😉

There are definitely cases like aws.StringValueSlice that are valuable, because they compress multiple lines of boilerplate into a single function call.

For now, I'll keep going with the StringValues, but I remain on the fence...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notably, aws.StringValue returns "" for nil references, so it is different than just a long synonym for *. That said, it's not clear whether we want the case of a nil reference appearing in one of these AWS API responses to trigger a panic or to coerce to "".

return err
}
}

if err := p.waitForTableState(name, true); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is not necessary - the calls to updateTable each do this waitForTableState, as it has to reach Ready before the next update step is performed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! That was a complete merge fail, sorry. Removed this.

@@ -97,12 +96,30 @@ func (p *applicationProvider) Create(ctx context.Context, obj *elasticbeanstalk.
if err != nil {
return "", err
}
return resource.ID(name), nil
return arn.NewElasticBeanstalkApplicationID(p.ctx.Region(), p.ctx.AccountID(), name), nil
}

// Read reads the instance state identified by ID, returning a populated resource object, or an error if not found.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment refers to Read

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch; I found a few others too (elasticbeanstalk/applicationVersion.go, elasticbeanstalk/environment.go, and lambda/function.go), and fixed those up.

} else if len(resp.Applications) == 0 {
return nil, nil
}
contract.Assert(len(resp.Applications) == 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an error instead of an assert?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we probably ought not to be asserting on any outputs from AWS APIs, just to be safe. It at least guards against our own misunderstanding of the APIs but they are fragile, inconsistent, and the risk seems really high that behavior will be incorrect.

joeduffy added a commit that referenced this pull request Jun 1, 2017
This change responds to some great feedback from @lukehoban on
#186.  Namely:

* Relax many assertions in the providers.  A failure here is pretty
  bad because it takes down the entire provider due to fail-fast.
  That said, I'd rather fail in response to a bug than let it go
  silently.  Nevertheless, a few of them were candidates for dynamic
  failures.

* Use `aws.StringValue` and friends in more places.

* Remove a superfluous waitForTableState in DynamoDB.

* Fix some erroneous references to `Read` in some `Get` comments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants