refactor: distinguish between init and attribute types in testing state classes #2331
tonyandrewmeyer wants to merge 6 commits into canonical:main
…ting and types attributes will be
james-garner-canonical
left a comment
I'm a big fan of making this change, thanks for taking care of this. I have a number of suggestions around typing and defaults, which I've made on individual lines, though they typically apply to more lines across the PR -- but I figured I'd keep the comments fewer than they'd otherwise be ...
Do you think this change warrants some additional unit tests, or are you happy that the existing tests would catch any errors in this PR? If the latter, please mention the relevant test suites.
| object.__setattr__(self, 'rotate', rotate) | ||
| object.__setattr__(self, '_tracked_revision', _tracked_revision) | ||
| object.__setattr__(self, '_latest_revision', _latest_revision) | ||
| _deepcopy_mutable_fields(self) |
WDYT about inlining the deepcopy calls above? Looks like it would just be tracked_content, latest_content, and remote_grants.
I'm OK with leaving it as-is to keep the PR simpler if that's your preference.
There was a problem hiding this comment.
I'd prefer to leave, to keep it simpler and so there's that large comment explaining things.
There was a problem hiding this comment.
I'm increasingly skeptical of this. It would now only matter for tracked_content and latest_content if we make the remote_grants arg's values Iterable[str]. The idea of a one liner that makes everything safe is nice, but I think the abstraction makes this a lot less clear, and it would be much better to have things inlined.
For example, I was surprised to notice just now that _deepcopy_mutable_fields only copies dict and list, not the other built in mutable collection set (which I think would ideally be made a frozenset in that method, but probably can't be right now in the general case for backwards compatibility, with the same reasoning that it doesn't convert list to tuple).
My point is, you have to read it to check that we're making things immutable correctly anyway, so it doesn't actually make this simpler for readers.
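To illustrate the concern about sets, here is a minimal sketch. It assumes a helper that behaves as described above (deep-copying only `dict` and `list` fields). The helper `deepcopy_dict_and_list_fields` and the class `Example` are hypothetical stand-ins, not the real `_deepcopy_mutable_fields` or any state class.

```python
import copy
import dataclasses


def deepcopy_dict_and_list_fields(obj):
    """Hypothetical helper: deep-copy only dict and list fields, as described above."""
    for field in dataclasses.fields(obj):
        value = getattr(obj, field.name)
        if isinstance(value, (dict, list)):
            # Bypass the frozen dataclass's __setattr__ guard.
            object.__setattr__(obj, field.name, copy.deepcopy(value))


@dataclasses.dataclass(frozen=True)
class Example:
    """Hypothetical frozen dataclass with one list field and one set field."""

    names: list
    tags: set

    def __post_init__(self):
        deepcopy_dict_and_list_fields(self)


names = ['a']
tags = {'x'}
example = Example(names=names, tags=tags)

# The list was copied, so later changes by the caller do not leak in.
names.append('b')
# The set was not copied, so the instance still shares it with the caller.
tags.add('y')
```

Under this assumption a caller who keeps a reference to `tags` can still change what the supposedly frozen instance holds, which is why the conversion has to be read and checked field by field anyway.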
There was a problem hiding this comment.
And we don't actually need deep copies for tracked_content and latest_content since they're flat Mapping[str, str], so we can just call dict in-line. This would be consistent with what we're now doing in Network.__init__.
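A small sketch of why a shallow copy is enough for a flat mapping of strings: the values are immutable, so `dict(...)` and `copy.deepcopy(...)` give equally isolated results. The variable names here are illustrative only.

```python
import copy
from collections.abc import Mapping


def copy_flat_content(content: Mapping[str, str]) -> dict[str, str]:
    """Shallow-copy a flat str -> str mapping into a new dict."""
    return dict(content)


original = {'password': 'hunter2'}
shallow = copy_flat_content(original)
deep = copy.deepcopy(original)

# Mutating the caller's dict afterwards affects neither copy, because
# the values are immutable strings and only the container needed copying.
original['password'] = 'changed'
original['extra'] = 'value'
```

A deep copy only starts to matter once the values themselves are mutable containers.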
In terms of not breaking the
In terms of tests for the changes, I'm not super keen on having tests like:
c = CloudCredential(auth_type="foo", redacted=['a', 'b', 'c'])
assert isinstance(c.redacted, list)
I know we have some tests where we expect pyright to find issues, but I'm not sure it's the right move to add something like that for this either. Do you have any suggestions in terms of tests?
Nice!
I wouldn't mind seeing tests a bit like the one you're not keen on, explicitly encoding (from a user perspective) the type conversion and copying behaviour that we're implementing.
redacted = ['a', 'b', 'c']
...
c = CloudCredential(auth_type="foo", redacted=redacted, ...)
assert isinstance(c.redacted, list)  # or tuple if we go that way
assert c.redacted == redacted
assert c.redacted is not redacted
...
The existing tests probably do cover a lot of this, but a lot of them are hard to follow at a glance due to the parametrization and abstraction.
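For reference, a self-contained version of that kind of test might look like the sketch below. `FakeCredential` is a hypothetical stand-in written for this example, not the real `CloudCredential`; it only assumes the pattern under discussion (a frozen dataclass with an explicit keyword-only `__init__` that copies its argument into a `list`).

```python
import dataclasses
from collections.abc import Iterable, Sequence


@dataclasses.dataclass(frozen=True, init=False)
class FakeCredential:
    """Hypothetical stand-in for a state class; not part of ops."""

    auth_type: str
    redacted: Sequence[str]

    def __init__(self, *, auth_type: str, redacted: Iterable[str] = ()):
        # Frozen dataclasses block normal assignment, so go through object.
        object.__setattr__(self, 'auth_type', auth_type)
        object.__setattr__(self, 'redacted', list(redacted))


def test_redacted_is_converted_and_copied():
    redacted = ['a', 'b', 'c']
    c = FakeCredential(auth_type='foo', redacted=redacted)
    assert isinstance(c.redacted, list)
    assert c.redacted == redacted
    assert c.redacted is not redacted
    # Changing the caller's list afterwards must not affect the instance.
    redacted.append('d')
    assert c.redacted == ['a', 'b', 'c']


test_redacted_is_converted_and_copied()
```

The final assertion is the one that checks the copying behaviour from a user's point of view; the `isinstance` check alone would pass even if the list were shared.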
dimaqq
left a comment
I kinda like this.
If it works, it's fair to merge :)
Happy to leave the details to James.
@james-garner-canonical brought up the excellent point that these are frozen dataclasses that we want people to treat as immutable. So giving their type checker information that they have a list (rather than an immutable Sequence) or a dict (rather than an immutable Mapping) leads them to where we don't want to go. So rejecting this instead.
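A rough sketch of the distinction being made here, using a hypothetical `FakeSpec` class rather than any real state class: the attribute is annotated as `Sequence[str]` even though the runtime value is a `list`, so a type checker has no `append` to offer, while the frozen dataclass blocks rebinding at runtime.

```python
import dataclasses
from collections.abc import Iterable, Sequence


@dataclasses.dataclass(frozen=True, init=False)
class FakeSpec:
    """Hypothetical stand-in for a state class; not part of ops."""

    ca_certificates: Sequence[str]  # advertised to users as read-only

    def __init__(self, *, ca_certificates: Iterable[str] = ()):
        object.__setattr__(self, 'ca_certificates', list(ca_certificates))


spec = FakeSpec(ca_certificates=('cert-a',))

# Rebinding the attribute fails at runtime because the dataclass is frozen.
try:
    spec.ca_certificates = []  # type: ignore[misc]
except dataclasses.FrozenInstanceError:
    rebinding_blocked = True
else:
    rebinding_blocked = False

# In-place mutation such as spec.ca_certificates.append('cert-b') is not
# blocked at runtime (the value is a list), but a type checker should
# reject it because Sequence does not define append.
```

Annotating the attribute as `list[str]` would remove that static warning, which is the outcome the comment above wants to avoid.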
…ave type checkers alert to mutating the state.
@james-garner-canonical I've adjusted per the discussion we had earlier in the week, and this should be good for reviewing again now, thanks!
james-garner-canonical
left a comment
I really like the direction here, and I definitely think it's worth making these changes to decouple the __init__ argument typing from the attribute typing.
I've flagged a number of items that I think require some further thought before merging.
| """ | ||
| remote_grants: Mapping[int, set[str]] = dataclasses.field(default_factory=dict) | ||
| remote_grants: Mapping[int, set[str]] |
I think this would ideally be frozenset[str] for immutability, but that would be a backwards incompatible change.
| remote_grants: Mapping[int, set[str]] | |
| remote_grants: Mapping[int, set[str]] # ideally frozenset[str] but set[str] for backwards compatibility |
And/or add this to the list of things to fix in the next breaking release?
There was a problem hiding this comment.
Though it would be neat if we could get away with making it Collection[str] in typing and frozenset at runtime without breaking anyone ...
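To make the backwards-compatibility worry concrete, here is a minimal sketch of what would change for a user if the values became `frozenset` at runtime. The variable names are illustrative and no real state class is involved.

```python
# Today's shape: the values are plain sets.
grants_as_set: dict[int, set[str]] = {1: {'app-a'}}
# The shape being wished for: the values are frozensets.
grants_as_frozenset: dict[int, frozenset[str]] = {1: frozenset({'app-a'})}

# Code that mutates a grant in place works with set ...
grants_as_set[1].add('app-b')

# ... but raises with frozenset, which has no add method.
try:
    grants_as_frozenset[1].add('app-b')  # type: ignore[attr-defined]
except AttributeError:
    frozenset_rejected_add = True
else:
    frozenset_rejected_add = False

# Read-only use (membership, iteration, len) behaves the same for both,
# which is all that a Collection[str] annotation promises.
read_only_use_matches = ('app-a' in grants_as_set[1]) and ('app-a' in grants_as_frozenset[1])
```

Only callers that mutate the grants in place would notice the switch; whether any exist is the open question.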
| latest_content: RawSecretRevisionContents | None = None, | ||
| id: str | None = None, | ||
| owner: Literal['unit', 'app'] | None = None, | ||
| remote_grants: Mapping[int, set[str]] = {}, |
We shouldn't enforce set[str] for the init argument IMO, since we'd really like it to be frozenset on the attribute in future.
| remote_grants: Mapping[int, set[str]] = {}, | |
| remote_grants: Mapping[int, Iterable[str]] = {}, |
This would require us to change how we set the remote_grants attribute below, like this:
object.__setattr__(self, 'remote_grants', {k: set(v) for k, v in remote_grants.items()})
| ) | ||
| object.__setattr__(self, 'id', id if id is not None else _generate_secret_id()) | ||
| object.__setattr__(self, 'owner', owner) | ||
| object.__setattr__(self, 'remote_grants', dict(remote_grants)) |
| object.__setattr__(self, 'remote_grants', dict(remote_grants)) | |
| # Ideally we'd use frozenset(v) but changing from set would be backwards incompatible. | |
| object.__setattr__(self, 'remote_grants', {k: set(v) for k, v in remote_grants.items()}) |
| object.__setattr__(self, 'service_statuses', dict(service_statuses)) | ||
| object.__setattr__(self, 'mounts', dict(mounts)) | ||
| object.__setattr__(self, 'execs', frozenset(execs)) | ||
| object.__setattr__(self, 'notices', list(notices)) |
We're taking our typing cue from the original default factory here. What's not immediately clear to me is whether this is a case like the other list ones where we typed as Sequence but really want to guarantee list (at least for equality comparisons?), so a local comment would be nice IMO.
| def __post_init__(self): | ||
| if not isinstance(self.execs, frozenset): | ||
| # Allow passing a regular set (or other iterable) of Execs. | ||
| object.__setattr__(self, 'execs', frozenset(self.execs)) |
Interestingly it looks like we previously didn't convert check_infos huh?
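As an illustration of that observation, the sketch below uses a hypothetical `FakeContainer` with a dataclass-generated `__init__` and a `__post_init__` that converts only one of two fields, mirroring the shape of the removed code. The field types are simplified to strings for the example.

```python
import dataclasses
from collections.abc import Iterable


@dataclasses.dataclass(frozen=True)
class FakeContainer:
    """Hypothetical stand-in for a state class; not part of ops."""

    name: str
    execs: Iterable[str] = frozenset()
    check_infos: Iterable[str] = frozenset()

    def __post_init__(self):
        if not isinstance(self.execs, frozenset):
            # Allow passing a regular set (or other iterable).
            object.__setattr__(self, 'execs', frozenset(self.execs))
        # check_infos is deliberately left unconverted in this sketch.


container = FakeContainer(name='foo', execs=['ls'], check_infos=['chk'])
```

With this shape, `execs` always ends up as a `frozenset` while `check_infos` keeps whatever type the caller passed, which is the kind of inconsistency an explicit `__init__` makes easier to spot.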
| object.__setattr__(self, 'content', dict(content)) | ||
| object.__setattr__(self, '_data_type_name', _data_type_name) | ||
| _deepcopy_mutable_fields(self) |
| object.__setattr__(self, 'content', dict(content)) | |
| object.__setattr__(self, '_data_type_name', _data_type_name) | |
| _deepcopy_mutable_fields(self) | |
| object.__setattr__(self, 'content', copy.deepcopy(content)) | |
| object.__setattr__(self, '_data_type_name', _data_type_name) |
| relations: Iterable[RelationBase] = dataclasses.field(default_factory=frozenset) | ||
| relations: frozenset[RelationBase] |
Should we use Collection instead for all these attributes?
| @pytest.mark.parametrize( | ||
| 'component,attribute,expected_type,input_value,required_args', | ||
| [ | ||
| # Mapping -> dict | ||
| (CloudCredential, 'attributes', dict, {'a': 'b'}, {'auth_type': 'foo'}), | ||
| (Secret, 'remote_grants', dict, {1: {'app'}}, {'tracked_content': {'k': 'v'}}), | ||
| (Notice, 'last_data', dict, {'k': 'v'}, {'key': 'foo'}), | ||
| (Container, 'layers', dict, {}, {'name': 'foo'}), | ||
| (Container, 'service_statuses', dict, {}, {'name': 'foo'}), | ||
| (Container, 'mounts', dict, {}, {'name': 'foo'}), | ||
| (StoredState, 'content', dict, {'k': 'v'}, {}), | ||
| # Iterable -> list | ||
| (CloudCredential, 'redacted', list, ('a', 'b'), {'auth_type': 'foo'}), | ||
| (CloudSpec, 'ca_certificates', list, ('a', 'b'), {'type': 'foo'}), | ||
| ( | ||
| Network, | ||
| 'bind_addresses', | ||
| list, | ||
| iter([BindAddress([Address('192.0.2.0')])]), | ||
| {'binding_name': 'foo'}, | ||
| ), | ||
| (Network, 'ingress_addresses', list, ('1.2.3.4',), {'binding_name': 'foo'}), | ||
| (Network, 'egress_subnets', list, ('1.2.3.0/24',), {'binding_name': 'foo'}), | ||
| (Container, 'notices', list, (Notice(key='foo'),), {'name': 'foo'}), | ||
| (State, 'deferred', list, (), {}), | ||
| # Iterable -> frozenset | ||
| (Container, 'execs', frozenset, (), {'name': 'foo'}), | ||
| (Container, 'check_infos', frozenset, (), {'name': 'foo'}), | ||
| (State, 'relations', frozenset, (Relation(endpoint='foo'),), {}), | ||
| (State, 'networks', frozenset, (Network(binding_name='foo'),), {}), | ||
| (State, 'containers', frozenset, (Container(name='foo'),), {}), | ||
| (State, 'secrets', frozenset, (Secret(tracked_content={'k': 'v'}),), {}), | ||
| (State, 'stored_states', frozenset, (), {}), | ||
| ], | ||
| ) | ||
| def test_init_converts_to_concrete_type( | ||
| component: type[object], | ||
| attribute: str, | ||
| expected_type: type, | ||
| input_value: Any, | ||
| required_args: dict[str, Any], | ||
| ): | ||
| """Verify that __init__ converts broader input types to concrete attribute types.""" | ||
| obj = component(**required_args, **{attribute: input_value}) | ||
| assert isinstance(getattr(obj, attribute), expected_type) |
I understand the intent here, but reading it breaks my brain a little.
Minor suggestion: reorder params to component, required_args, attribute, input_value, expected_type to follow the logic of the test.
More radical suggestion: have AI unroll this into a series of unparametrised tests like:
def test_cloud_credential_init_converts_args():
    obj = CloudCredential(
        auth_type='foo',  # required
        attributes={'a': 'b'},
        redacted=('a', 'b'),
    )
    assert obj.attributes == {'a': 'b'}
    assert obj.redacted == ['a', 'b']
When designing the Scenario 7 API we introduced kw-only args, and originally had custom __init__ for each state class to support that. We decided to change that because it felt busy and a lot of maintenance.
However, we currently have an unfortunate mismatch between some of the types accepted to create an instance of a state class and the type the corresponding attribute will be. For example, the init might accept any Mapping but we know the attribute will always be a dict. It would be nice to provide that information to users.
Now that we are using Python 3.10+, we do have some classes without this issue that can continue to use the dataclasses generated __init__. However, there are many that would be better as more explicit, and I am not convinced it's too much work to maintain.
We opened the door to this in #2274 adjusting CheckInfo. This PR applies the same improvement to the rest of the state classes.
Fixes #2152