Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python SDK changes to support input/output classes #5033

Merged
merged 12 commits into from
Aug 19, 2020
Merged

Conversation

justinvp
Copy link
Member

@justinvp justinvp commented Jul 21, 2020

SDK changes to support strongly-typed input/output "dataclasses".

PR for provider codegen: #5034

Example diffs:

Fixes #4789
Fixes #5140

@justinvp justinvp marked this pull request as ready for review July 23, 2020 16:19
@justinvp
Copy link
Member Author

There are some remaining TODOs, but this can be reviewed in the meantime.

sdk/python/lib/pulumi/output.py Outdated Show resolved Hide resolved
tests/integration/dynamic/python_types/__main__.py Outdated Show resolved Hide resolved
tests/integration/integration_test.go Outdated Show resolved Hide resolved
sdk/python/lib/pulumi/output.py Outdated Show resolved Hide resolved
sdk/python/lib/pulumi/runtime/rpc.py Outdated Show resolved Hide resolved
sdk/python/lib/pulumi/runtime/rpc.py Outdated Show resolved Hide resolved
Comment on lines +99 to +358
# Clean-up class attributes.
for name in props:
# If the class attribute (which is the default value for this prop)
# exists and is of type 'Property', delete the class attribute so
# it is not set at all in the post-processed class.
if isinstance(getattr(cls, name, None), _Property):
delattr(cls, name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this a bit more? I'll admit I'm a bit lost here :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can embellish in a comment, but in the meantime, here's an explanation:

Quick summary of dataclass feature

The design and implementation is inspired and based on the built-in dataclass feature, where you can define classes like:

@dataclass
class C:
    x: int
    y: int = field(repr=False)
    z: int = field(repr=False, default=10)
    t: int = 20

Relevant quotes from https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass:

The dataclass() decorator examines the class to find fields. A field is defined as class variable that has a type annotation.

[...]

fields may optionally specify a default value, using normal Python syntax:

@dataclass
class C:
   a: int       # 'a' has no default value
   b: int = 0   # assign a default value for 'b'

And https://docs.python.org/3/library/dataclasses.html#dataclasses.field:

For common and simple use cases, no other functionality is required. There are, however, some dataclass features that require additional per-field information. To satisfy this need for additional information, you can replace the default field value with a call to the provided field() function. For example:

@dataclass
class C:
    mylist: List[int] = field(default_factory=list)

c = C()
c.mylist += [1, 2, 3]

[...]

If the default value of a field is specified by a call to field(), then the class attribute for this field will be replaced by the specified default value. If no default is provided, then the class attribute will be deleted. The intent is that after the dataclass() decorator runs, the class attributes will all contain the default values for the fields, just as if the default value itself were specified. For example, after:

@dataclass
class C:
    x: int
    y: int = field(repr=False)
    z: int = field(repr=False, default=10)
    t: int = 20

The class attribute C.z will be 10, the class attribute C.t will be 20, and the class attributes C.x and C.y will not be set.

Our design

We're allowing output types to be specified in a similar way:

@pulumi.output_type
class Foo:
    my_arg: str = pulumi.property("myArg")
    some_other_value: bool = pulumi.property("someOtherValue")

Dataclasses uses "fields" as its terminology, but for us, I'm referring to these as "properties". Our "properties" can similarly be defined as class variables with type annotations. Our pulumi.property() function is analogous to the dataclasses.field() function, and is used to provide additional information associated with the "property": in this case, the Pulumi camelCase name (pun, omg, 😆).

We first determine the class's "properties" in _properties_from_annotations() by looking at the class's __annotations__ (which is the dict {"my_arg": str, "some_other_value": bool} for the example above) and then look up the class attribute value for each key in that dict. In this example, the value of the my_arg class attribute is an instance of _Property(name="myArg"), and the value of some_other_value is _Property(name="someOtherValue"). We do some more massaging, store the type annotation in the _Property instance, and then _properties_from_annotations() returns a dictionary of the class attribute names to _Property instances.

Then the decorator "cleans up" the class, by removing any of the class attributes that were set to _Property via a call to pulumi.property().

Dataclasses does the same thing, cleaning up the class attributes that were set to values returned from calls to field().

Comment on lines 275 to 279
# If the class has a _translate_property() method, use it to translate
# property names, otherwise, use an identity function.
translate = getattr(cls, _TRANSLATE_PROPERTY, None)
if not callable(translate):
translate = lambda self, prop: prop
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that we're translating from Python -> Pulumi names at runtime inside a property getter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's at runtime, but this is actually translating the other way from Pulumi -> "Python (mapping table)" names, and it's only done for output types.

This is how output types work in this PR:

When unmarshaling the outputs, dictionary keys are translated using the resource's translate_output_property() function (as is the case today). If the dict is intended to be an output class, we instantiate the class, passing the translated dict to the output classes __init__().

The output classes that we generate for our providers inherit from dict, so when we pass the translated dict to the output class's __init__(), it's actually just copying the translated dict into itself (if the output class didn't inherit from dict, it saves a reference to the translated dict as self._values). At this point the object has the translated keys, which is what we want to maintain backwards compatibility. Existing programs accessing the values as a dict can continue to do so, using the translated key names.

But these objects now have a nicer way of accessing the values, using the new strongly named/typed Python properties that we've added. These properties need to access the values, and since the keys are the translated keys, our provide codegen also generates a _translate_property() method on the output class, which does the same things as the resource's translate_output_property(): translates Pulumi -> "Python (mapping table)" names. If we see we have a _translate_property(), we call it to translate the key to lookup the value.

Copy link
Member

@komalali komalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is kinda difficult to digest, perhaps a comment with the overall strategy would be useful at the top of _types.py. Otherwise, I'll defer to the feedback from other folks since I'm out for the next week. Glad to see this all coming together!

sdk/python/lib/pulumi/_utils.py Outdated Show resolved Hide resolved
sdk/python/lib/pulumi/_utils.py Outdated Show resolved Hide resolved
@justinvp
Copy link
Member Author

@pgavlin, I've addressed your initial feedback, and also added a commit to this PR that instantiates nested output types for calls to invoke. PTAL and let me know if you have any other feedback.

justinvp and others added 9 commits August 18, 2020 11:01
This commit adds an optional type parameter to `invoke`. When specified,
the type will be instantiated along with any nested types.
This enables docstrings to show up when hovering over properties in PyCharm and VS Code (w/ Pylance).

The property values are stored in and retrieved from the object's `__dict__`, which allows existing things like `export("res", res)` to continue to work the same as before, and aligns the input/output types to store values in `__dict__` the same way.
Two main changes:

1. Previously, if an @output_type class didn't have an __init__ method,
one would be added by the decorator that accepted a single dict argument
representing the output properties. This commit makes it so that instead
of accepting a single dict argument, it accepts individual arguments for
each property based on annotations defined on the class. Classes
decorated with @input_type now do the same thing.

2. Values in input/output types are now stored in `self.__dict__` as
snake_case Python names, rather than being stored in `self._values` as
camelCase Pulumi names, to be consistent with how values are stored for
resources. This allows these types to work the same as resources when
passed to `pulumi.export`. There are two exceptions for output types:
(a) if the class is a subclass of dict, the values are stored in the
dict itself (rather than __dict__), and (b) if the output class has a
`_translate_property` method, the names will be the translated names
from calling that method.
Co-authored-by: Komal <komal@pulumi.com>
I forgot to update these tests after updating the output types to store values as snake_case Python names (unless there is a _translate_property). Minor cleanup while making the changes here.
This avoids raising type mismatch errors when the deserialized output type is a dict or list but the type annotation is non-generic Output.
@justinvp justinvp merged commit cd9fae5 into master Aug 19, 2020
@pulumi-bot pulumi-bot deleted the justin/pyio_sdk branch August 19, 2020 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Python: Instantiate output types for calls to invoke Support serializing/deserializing Python classes
5 participants