Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sdk/nodejs] Unmarshal output values in component provider #8205

Merged
merged 2 commits into from
Nov 15, 2021

Conversation

justinvp
Copy link
Member

@justinvp justinvp commented Oct 12, 2021

This adds support for unmarshaling output values in the Node.js provider.

Fixes #8155

@justinvp justinvp force-pushed the justin/outputvalues_nodejs branch 2 times, most recently from 95aea32 to f363218 Compare October 13, 2021 20:48
@justinvp justinvp marked this pull request as ready for review October 13, 2021 20:49
@justinvp justinvp requested review from iwahbe and a team October 13, 2021 21:06
Comment on lines 499 to 506
// If the property dependencies are equal to or a subset of the gathered nested
// dependencies, we don't need to create a top-level output for the property
// because any nested output values will have already been deserialized as outputs.
const gatheredDeps = gatheredPropDeps.get(k);
if (!isSecret && gatheredDeps && gatheredDeps.size > 0 && contains(gatheredDeps, depsUrns)) {
result[k] = input;
continue;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should figure out what exactly we want to do here. I'd love to deprecate the input dependencies map and just rely on first-class output values. Do you think that's realistic? If so, what do we need to do to get there? If it's not realistic, we really should write down the semantic if input dependencies are present (and what you have here may be fine in that case).

Copy link
Member Author

@justinvp justinvp Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've explored an approach where the program SDK omits the deps map when sending output values to the engine. As well as having the engine omit sending the deps map when sending output values to the provider. The engine would need to be able to recreate the deps map from the output values when sending to an older provider that doesn't accept output values.

@pgavlin, it's this last part where I've run into a problem that I'd like your thoughts on.

There's an edge case where I can't fully recreate the deps map from output values alone in the engine. The program SDKs produce the deps map with deps from both outputs as well as resource refs. The problematic case is when a property contains a resource ref to a regular (non-remote) component. In that case, the resource ref only contains the URN of the component, but the deps map the program SDK produces includes the expanded transitive deps of the component. For example:

class Component extends pulumi.ComponentResource {
    constructor(name: string, opts?: pulumi.ComponentResourceOptions) {
        super("pkg:index:Component", name, {}, opts);
        new random.RandomPet("inner1", {}, { parent: this });
        new random.RandomPet("inner2", {}, { parent: this });
    }
}

const comp = new Component("component");

const remote = new RemoteComponent("remotecomponent", {
    tags: {
        foo: comp,
    },
});

The input properties of remotecomponent:

resource.PropertyMap{
    "tags": {
        V:  resource.PropertyMap{
            "foo": {
                V:  resource.ResourceReference{
                    URN:            "urn:pulumi:dev::program::pkg:index:Component::component",
                    ID:             resource.PropertyValue{},
                    PackageVersion: "",
                },
            },
        },
    },
}

And the property deps map the program SDK produces:

map[resource.PropertyKey][]resource.URN{
    "tags": {
        "urn:pulumi:dev::program::pkg:index:Component$random:index/randomPet:RandomPet::inner1",
        "urn:pulumi:dev::program::pkg:index:Component$random:index/randomPet:RandomPet::inner2",
    },
}

If the program SDK omits the deps map and the engine needs to create it to send to an older provider that doesn't accept output values, there's no way to create the same deps map that the program SDK would have created, because the engine only has the URN of the component in the resource ref, not its transitive deps.

Here are some ways we could potentially address this:

  1. Include a component's expanded transitive deps in its resource ref. The engine could then fully recreate the deps map when it needs to. The program SDKs could add an optional dependencies key to the special resource ref object when serializing regular components, the value of which would be the list of the component's expanded transitive deps.

  2. Continue to always have the program SDKs send the deps map to the engine (the SDKs need to do it anyway for the cases where output values aren't currently being sent (i.e. remote isn't true)). If a provider accepts output values, the engine could then figure out whether it can omit the deps map to the provider, which it should be able to do the majority of the time, only sending the deps map to the provider for the edge case with component resource refs.

  3. Variant of (2) where program SDKs omit the deps map when sending output values to the engine the majority of the time unless it runs into this edge case with components. Program SDKs could even send a trimmed version of the deps map that only includes entries for the properties that needs it for the component resource ref. And/or the engine sends a trimmed version of the deps map to the provider only for properties in this edge case.

I think (1) seems like a reasonable approach, but curious what others think. Also curious if anyone has any other ideas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having a very hard time reasoning about this, I wish for a simpler system. I still am bullish on the https://docs.google.com/document/d/1ZXNjuNtekpGjXGgMDjEgbjPemRsDPMmfMMGoS9onMSM/edit?usp=sharing proposal ~ if we could get to a place where only the engine does transitive closure expansion, and SDKs/providers only worry about preserving dep edges collected out of options and Output control flow analysis, that'd be good I think.

there's no way to create the same deps map that the program SDK would have created, because the engine only has the URN of the component in the resource ref, not its transitive deps.

Do they have to be the same maps?

because the engine only has the URN of the component in the resource ref, not its transitive deps.

What bad happens if the transitive deps are omitted?

As far as I can discern, the primary purpose of deps is to order RegisterResource calls so things are provisioned in the correct order, and then have enough of the graph in place after all is said and done to do deletions correctly. Perhaps in this case, since we're already in the engine, this guarantees that RegisterResource is ordered correctly even if we omit transitive deps?

Copy link
Member Author

@justinvp justinvp Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent points, @t0yv0! Thanks for your comments. @pgavlin and I discussed it and we think we don't need the transitive deps here, since all children will be ready as part of marshaling. Deletes will still be correct.

The only reason we might want to have the transitive deps would be to make the dep behavior inside the MLC even more consistent with how it'd be outside the MLC, but we don't think it's necessary.

Based on my discussion with @pgavlin, I'm going to proceed with the following:

  1. Omit sending the deps map to the engine when sending output values.
  2. Don't add dependencies to the resource ref. (If we do find we need them, we can always add later in the resource ref as I suggested).
  3. For the Construct call from the engine to the provider, if the provider doesn't support output values, we'll "polyfill" (i.e. recreate) the deps map, filling in the information we do have.

Separately, I'm going to look into whether we can make a clean break and just require output value marshaling support across the SDK, CLI, and provider in order to use MLC. It may be early enough in MLC usage where such a break is reasonable. In which case we don't need to keep the polyfill.

And for near term future iterations, consider broadening our usage of output values so we can deprecate these dependency maps elsewhere in the system.

Lastly, there was another potential issue that came up in our conversation: the way we currently deserialize resource refs in Node.js and Python is to fallback to returning either the string id or urn when a resource module was not found. Instead, we likely want to change that to return an DependencyResource (which is what we do in Go and .NET). Update: Opened #8417 to track this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#8410 makes the changes to deprecate the dependency maps (it can be merged after this PR), and I updated this PR to be simpler (no longer gathering dependencies during deserialization, instead only checking for empty deps or if the inputs contain outputs). I've also added commits to #8212 for Python to match this PR.

sdk/nodejs/runtime/rpc.ts Outdated Show resolved Hide resolved
Copy link
Member

@t0yv0 t0yv0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, tests show that output values can now deserialize. Had some questions for better understanding. Thanks.

This adds support for unmarshaling output values in the Node.js provider.
Copy link
Member

@iwahbe iwahbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks familiar. 😃 LGTM

@justinvp justinvp merged commit 10ceee4 into master Nov 15, 2021
@pulumi-bot pulumi-bot deleted the justin/outputvalues_nodejs branch November 15, 2021 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Output Values: Node SDK unmarshaling (used in MLC provider)
4 participants