-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fields annotated as Dict now has their struct flag set to False by default #581
Conversation
8c792dc
to
ad9580c
Compare
This pull request introduces 2 alerts when merging 0b362b6 into 7dbb11a - view on LGTM.com new alerts:
|
0b362b6
to
170786f
Compare
170786f
to
b096f3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides some minor typos, I'm going to need to better understand this struct
stuff in order to review this PR
@@ -906,6 +906,9 @@ def _node_wrap( | |||
) | |||
if is_dict: | |||
key_type, element_type = get_dict_key_value_types(type_) | |||
flags = {} | |||
if is_dict_annotation(type_): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused with the distinction between type_
and ref_type
in this function: they are both set to the same value in the code calling it, but tests seem to use different values sometimes. Also in this function there are both occurrences of ref_type=type_
and ref_type=ref_type
, which doesn't help understanding what is going on.
Would it be possible to clarify in a docstring or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both type_
and ref_type
are constant within this function's body.
We can refactor by replacing each type_
with ref_type
and dropping type_
from the function signature.
Nevermind, @odelalleau you make a good point about the tests calling with different values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is complicated and the desired behavior is driven by the tests.
At a high level ref_type is equivalent to an annotation type and object_type is, well, the object type.
x : Any = "str"
In the above, the ref_type of x would be Any and the object_type would be str.
ref_type is usually used when assigning an object:
x : str = "str"
x = 10 # will trigger a validation error.
# in actual OmegaConf code:
cfg = OmegaConf.create({"x" : StringNode("str")})
cfg.x = 10
# or with dataclass:
@dataclass
class Foo:
x : str = "str"
cfg = OmegaConf.structured(Foo)
cfg.x = 10
There are some scenarios, especially during merge - where we use the ref_type for more things.
@@ -906,6 +906,9 @@ def _node_wrap( | |||
) | |||
if is_dict: | |||
key_type, element_type = get_dict_key_value_types(type_) | |||
flags = {} | |||
if is_dict_annotation(type_): | |||
flags["struct"] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not directly related to this change, but I was trying to understand how things are working, and I'm confused by this:
cfg = OmegaConf.create({"a": {"b": 0}})
OmegaConf.set_struct(cfg, True)
cfg.a = {"b": 1}
which raises this exception:
ConfigKeyError: Key 'b' is not in struct
full_key: a.b
object_type=dict
What's going on there? (it's the same on master btw)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call it a bug.
file an issue. I am expecting this assignment to succeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, at higher level, it feels a bit like a hack, and I tend to agree with what you said in #461: "From the perspective of OmegaConf, this does not make sense". What would make more sense to me would be to support "real" Python dicts / lists as ValueNodes. Something like: @dataclass
class Example:
real_dict: Dict[str, int]
dict_config: DictConfig[str, int] The first one ( That'd be a big change though (and I definitely haven't thought it through)... so I won't be pushing back against this PR, but I still wanted to mention it. |
Co-authored-by: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Co-authored-by: Olivier Delalleau <507137+odelalleau@users.noreply.github.com>
Some context about the evolution of OmegaConf. Initially, there was no type safety at all. The behavior was simialar to a plain python dict, with some additions like interpolation, support for missing values, etc. When it was introduced, DictConfig access of a none existing field was returning None (this is changing in 2.1), and assignment of none existing fields created them as expected. The next step is Structured Configs. In some scenarios, we want a dataclass that both defines some fields but also allow the assignment of arbitrary new fields. The high level bit for Structured Configs extending Dict, is that they are open to changed even though the node is a Structured Config. Another thing that is supported is a field that is typed as Dict[K, V]. In both of of the above cases, Dict is used to soften the strictness of Structured Configs. If you think about the evolution of OmegaConf it would make more sense but I do agree it's not obviously the right choice. I will take a second look at this to see if there is an alternative way to support the behavior I need, in particular I am thinking to introduce a mechanism that will mark a node as flags root and will stop the recursive querying of parents. |
Thanks a lot for the detailed extra context, that helps! (I'll re-read this PR later in light of this -- will probably happen only tomorrow though) Just a quick question though: what's the motivation for not setting the |
The reason I added OmegaConf.structured() was to support ducktyping: user: User = OmegaConf.structured(User). Your question is actually why is the struct flag not set to True on Structured Config nodes. I think this could have been implemented that way had I introduced non recursive node flags as well. |
I just wanted to respond to it: You are trying to (ab)use a different notation to indicate that some things should not happen on the node. What about interpolations? what about type safety? is In short, I don't think this is a good direction. |
Fair enough. Just to quickly answer your question, this would have been a new subclass of |
If I understand you correctly, this means it would get assignment validation only, but later it will be allowed to drift from the schema. |
I am closing this in favor of the more subtle #588. |
That's correct (static type checking should still work though, similar to regular Python dicts) |
What about merging from the command line? |
Use a I'm definitely not saying these should replace the current DictConfig mechanics. Just that I can see a reason to support plain dicts as well (maybe because we want to rely on some features not implemented in DictConfig, or want to avoid some DictConfig features leaking to the dict -- like the struct flag -- or for performance reasons) |
Python supports plain dict plenty. |
close #461
The expectation of a Structured Config with a Dict field is that the dict field is free to accept new keys:
This expectation is violated if the Structured Config has a parent in struct mode.
This PR is fixing the issue by explicitly setting DictConfig typed as Dict[K, V] to have a struct flag set to False.
There could be some side effects to this (this will open up any non DictConfig children (not structured configs) of the dict), but that seems pretty unlikely to cause a real issue in practice.
This is a potential use case for non recursive config flags, but for now I am parking that idea and will revisit if there is better use case.