Improve Optional[type] support #749

Merged
merged 49 commits into from Nov 15, 2021

Conversation

Jasha10
Collaborator

@Jasha10 Jasha10 commented Jun 11, 2021

This PR is to improve validation for Optional[...] typing.
The main contribution is to allow element_type to be optional, e.g.

DictConfig({}, element_type=Optional[int])

is now legal.

Summary of changes:

  • In the wrap function, check whether the parent's _metadata.element_type is optional. This means that, for the new child value, is_optional==True if element_type==Any and is_optional==False if e.g. element_type==int. This closes #579 ("When validating the assignment to a list, None is not properly validated").
  • Modify _utils.valid_value_annotation_type so that Optional[...] values are considered valid. Check valid_value_annotation_type(element_type) in ContainerMetadata.__post_init__. This closes #460 ("Dict[str, Optional[str]] fails").
  • In ContainerMetadata.__post_init__: replace an assertion with a raise ValidationError; remove redundant raise ValidationError statements from ListConfig.__init__ and DictConfig.__init__.
  • Use _utils._resolve_optional as needed to adapt code for optional element_type.
  • Update OmegaConf.merge so that an optional element_type is handled properly.
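
To make the Optional handling in the summary above concrete, here is a minimal sketch in plain Python of what resolving an optional type hint can look like. The names resolve_optional and none_is_allowed are hypothetical and only loosely follow the idea behind _utils._resolve_optional; this is not OmegaConf's actual code.

```python
from typing import Any, Optional, Tuple, Union, get_args, get_origin


def resolve_optional(type_: Any) -> Tuple[bool, Any]:
    """Split a type hint into (is_optional, inner_type).

    Hypothetical helper for illustration only.
    """
    if type_ is Any:
        # Any accepts every value, including None.
        return True, Any
    if get_origin(type_) is Union:
        args = get_args(type_)
        non_none = [a for a in args if a is not type(None)]
        # Optional[T] is Union[T, None]: exactly one non-None member.
        if len(args) == 2 and len(non_none) == 1:
            return True, non_none[0]
    return False, type_


def none_is_allowed(element_type: Any) -> bool:
    """Return True if None may be stored under this element type."""
    is_optional, _ = resolve_optional(element_type)
    return is_optional
```

Under these assumptions, none_is_allowed(Optional[int]) and none_is_allowed(Any) are true while none_is_allowed(int) is false, which mirrors the is_optional rule described in the first bullet.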

tests/test_merge.py Outdated Show resolved Hide resolved
tests/test_optional.py Outdated Show resolved Hide resolved
tests/test_optional.py Outdated Show resolved Hide resolved
tests/test_optional.py Outdated Show resolved Hide resolved
@Jasha10 Jasha10 changed the title from "wrap: check if elt type is optional" to "Improve Optional[type] validation" Jun 11, 2021
Collaborator

@odelalleau odelalleau left a comment

Overall looks great, just a few small things (NB: there's also a pickling error with 3.6 to look into)

omegaconf/base.py Show resolved Hide resolved
tests/test_typing.py Outdated Show resolved Hide resolved
tests/test_typing.py Outdated Show resolved Hide resolved
  # merging into a new node. Use element_type as a base
- dest[key] = DictConfig(content=dest._metadata.element_type, parent=dest)
+ dest[key] = DictConfig(
+     et, parent=dest, ref_type=et, is_optional=is_optional
Collaborator

More of a meta question for @omry: should a DictConfig created from a structured config type automatically set its ref_type to that type? (here, that would mean not needing to specify ref_type=et)

Owner

In the context of merge, I think the ref_type should be preserved from the target node.
If there is no target node (new node), I think the ref type should be Any (default, IIRC).

The concern is that this will change the assignment semantics of the node.

from dataclasses import dataclass
from omegaconf import DictConfig, OmegaConf

@dataclass
class Config:
    x: int = 10

cfg = OmegaConf.create({
    "a": Config(),
    "b": DictConfig(Config(), ref_type=Config),
})

cfg.a = "aaa"  # ok
cfg.b = "aaa"  # error

@Jasha10, what is the motivation for specifying the ref_type here?

Collaborator Author

@Jasha10 Jasha10 Jun 15, 2021

In the context of merge, I think the ref_type should be preserved from the target node.

Agreed. This PR is consistent with preserving ref_type from the target node.

If there is no target node (new node), I think the ref type should be Any (default, IIRC).

Usually, but I think that if parent's element_type is set
then (for a new node) we should have parent_element_type == child_ref_type.
For example:

from typing import Any
from omegaconf import DictConfig, OmegaConf
from tests import User

cfg = OmegaConf.merge(DictConfig({}, element_type=User), {"a": User("Bond")})

# On this branch:
assert cfg.a._metadata.ref_type == User

# On master branch:
assert cfg.a._metadata.ref_type == Any

This change is consistent with how assignment works: child_ref_type is
inherited from parent_element_type.

cfg2 = DictConfig({}, element_type=User)
cfg2.a = User("Bond")
# On this branch AND master branch:
assert cfg2.a._metadata.ref_type == User

tests/test_typing.py Outdated Show resolved Hide resolved
@Jasha10
Collaborator Author

Jasha10 commented Jun 13, 2021

Overall looks great, just a few small things (NB: there's also a pickling error with 3.6 to look into)

Apparently Python 3.6 pickling doesn't play well with Optional:

$ python
Python 3.6.13
>>> from typing import Optional
>>> import pickle
>>> pickle.dumps(Optional[int])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle typing.Union[int, NoneType]: it's not the same object as typing.Union

The CI failure was triggered because I added fields to the tests.SubscriptedList and tests.SubscriptedDict dataclasses, which are used in test_serialization.py to test pickling.
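
For comparison, here is a small stdlib-only check that Optional[int] survives a pickle round trip. It assumes Python 3.7 or newer, where the failure above is not expected. The helper name roundtrip is made up for illustration.

```python
import pickle
from typing import Any, Optional


def roundtrip(obj: Any) -> Any:
    """Pickle and unpickle an object in memory."""
    return pickle.loads(pickle.dumps(obj))


# Assumption: running on Python 3.7+, unlike the 3.6 session shown above.
restored = roundtrip(Optional[int])
assert restored == Optional[int]
```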

@Jasha10
Collaborator Author

Jasha10 commented Jun 13, 2021

Apparently Python 3.6 pickling doesn't play well with Optional:

Resolved in e2b2800 using pytest.mark.skipif for the cases that involve pickling Optional.

omegaconf/listconfig.py Outdated Show resolved Hide resolved
@Jasha10 Jasha10 marked this pull request as ready for review June 13, 2021 23:53
tests/test_errors.py Outdated Show resolved Hide resolved
tests/test_typing.py Outdated Show resolved Hide resolved
tests/test_typing.py Outdated Show resolved Hide resolved
Collaborator

@odelalleau odelalleau left a comment

Good for me once ongoing discussions with @omry are resolved, thanks!

@Jasha10 Jasha10 marked this pull request as draft September 4, 2021 22:36
Collaborator

@odelalleau odelalleau left a comment

A few comments on the latest commit

tests/test_merge.py Outdated Show resolved Hide resolved
tests/test_merge.py Outdated Show resolved Hide resolved
tests/test_merge.py Outdated Show resolved Hide resolved
),
param(
(DictConfig(content={}, element_type=User), {"foo": None}),
raises(ValidationError, match="child 'foo' is not Optional"),
Collaborator

Minor comment: may be worth looking at the full error message to make sure it makes sense (the excerpt being tested could be confusing)

Collaborator Author

@Jasha10 Jasha10 Sep 29, 2021

In 0d8196c I've updated the tests to match the full error message (excluding the one with id="new_str_none", which is currently not passing).

Collaborator

Thanks -- though personally I tend to prefer checking snippets to make tests more concise (leaving full checks to test_errors.py) -- but anyway the new version looks good to me.

This actually confirms that this error message ("child 'foo' is not Optional") is a bit confusing (IMO). I can see where it may be coming from (we first add a foo node of type User then try to set it to None), but a clearer error could be something like: "Can't merge None values into a DictConfig whose element type is not Optional".

Definitely not something to change in this PR (and may not be worth changing it at all) -- I just thought I'd mention it because it caught my eye :)

(edit: feel free to resolve)

@omry
Owner

omry commented Sep 27, 2021

Python 3.6 eol is coming in a couple of months: https://endoflife.date/python

It would be good to issue a better error than Can't pickle typing.Union[int, NoneType]: it's not the same object as typing.Union on Python 3.6 when we run into this problem.
Furthermore, we can consider dropping formal Python 3.6 support for OmegaConf 2.2.

@Jasha10
Collaborator Author

Jasha10 commented Sep 29, 2021

It would be good to issue a better error than Can't pickle typing.Union[int, NoneType]: it's not the same object as typing.Union on Python 3.6 when we run into this problem.

Currently I can't reproduce that pickling error on the master branch (element_type==Optional[int] causes an AssertionError before the DictConfig instance can be fully initialized).

Here is a minimal repro for this branch:

import pickle
from dataclasses import dataclass
from typing import Dict, Optional
from omegaconf import OmegaConf

@dataclass
class DictOptInt:
    dict_opt: Dict[str, Optional[int]]

cfg = OmegaConf.structured(DictOptInt)
with open("tmp.pkl", "wb") as f:
    pickle.dump(cfg, f)  # causes an error on Python 3.6

I think catching the pickling error would require adding logic to BaseContainer.__getstate__ to handle the case where python_version<=3.7 and a Union type appears in self._metadata.
I'll follow up once the problem can be reproduced on master.

Edit: or should I add that logic to this PR?
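
As a rough illustration of the "catch the pickling error and report something clearer" idea, here is a stdlib-only sketch. ClearPicklingError and dumps_with_context are hypothetical names invented for this example; nothing here reflects how BaseContainer.__getstate__ is or will be implemented.

```python
import pickle
from typing import Any


class ClearPicklingError(Exception):
    """Hypothetical exception type used only in this sketch."""


def dumps_with_context(obj: Any, description: str) -> bytes:
    """Pickle obj, re-raising failures with a more descriptive message."""
    try:
        return pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError) as exc:
        # Keep the original exception as the cause so no detail is lost.
        raise ClearPicklingError(
            f"Failed to pickle {description}: {exc}"
        ) from exc
```

A caller could pass a description such as "a config whose element_type is Optional[int]" so that the message names the problematic object instead of only the low-level Union complaint.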

@odelalleau
Collaborator

Edit: or should I add that logic to this PR?

Future PR sounds better to me, can you just make sure there is an xfail test for Python 3.6 in this one? (if not already the case)

@odelalleau
Collaborator

Merge branch 'master' into closes579

Meta-discussion: I personally prefer to rebase on top of master and force-push (doing it only if needed) => this makes it easy to see all changes (commits) done on top of master, without obscure merge commits that may hide how some conflicts were handled.

=> This is only me though and I wonder if there's a good argument in favor of merging master like you did

@odelalleau odelalleau self-requested a review September 29, 2021 12:41
Collaborator

@odelalleau odelalleau left a comment

Should be good for me once everyone else's happy (including CI)

@Jasha10
Collaborator Author

Jasha10 commented Sep 29, 2021

Merge branch 'master' into closes579

Meta-discussion: I personally prefer to rebase on top of master and force-push (doing it only if needed) => this makes it easy to see all changes (commits) done on top of master, without obscure merge commits that may hide how some conflicts were handled.

Good point about hiding how merge conflicts were handled.

=> This is only me though and I wonder if there's a good argument in favor of merging master like you did

My main motivation was to prevent earlier commits from becoming orphaned (so that review comments linking to specific commits would not be out-of-date).
I was also worried that the review comments themselves would become orphaned (i.e. that they would no longer show up in GitHub's "Files changed" tab). This seems not to be the case, however: based on my experiment in a test repo, GitHub is smart enough to re-apply the comments in the correct place on the rebased branch. I'll be rebasing in the future :)

@omry
Owner

omry commented Sep 30, 2021

Meta-discussion: I personally prefer to rebase on top of master and force-push (doing it only if needed) => this makes it easy to see all changes (commits) done on top of master, without obscure merge commits that may hide how some conflicts were handled.

Not 100% sure we are talking about the same thing, but I prefer not to force push to master because it breaks anyone that has a checkout (or is keeping a git ref id stored).
You can reorganize things on your own branch (squash, or splitting into multiple commits) and force push and finally do a regular push when merging the PR.

@odelalleau
Collaborator

GitHub is smart enough to re-apply the comments in the correct place on the rebased branch. I'll be rebasing in the future :)

Yeah it's been working fine for me so far, I haven't noticed any major issue. The main problem I've found is that comments referencing specific commits would now refer to the old version of these commits. It's often not a real problem because the diff is usually the same after rebase, but it's something to be aware of (especially if you have many unresolved discussions pointing to various commits).

Not 100% sure we are talking about the same thing, but I prefer not to force push to master because it breaks anyone that has a checkout (or is keeping a git ref id stored). You can reorganize things on your own branch (squash, or splitting into multiple commits) and force push and finally do a regular push when merging the PR.

We're not talking about the same thing :) What I was talking about is when you want to bring in some changes from master into your PR branch (typically to solve a conflict, or to fix a broken test). You can either merge master into your branch, or rebase your branch on top of master (and I was saying that I personally favor the second approach).

@Jasha10 Jasha10 marked this pull request as ready for review October 1, 2021 19:03
@Jasha10 Jasha10 marked this pull request as draft October 1, 2021 21:17
@Jasha10
Collaborator Author

Jasha10 commented Nov 2, 2021

Rebased against master to incorporate changes from #800 (fixing the CI).

@Jasha10 Jasha10 marked this pull request as ready for review November 2, 2021 07:46
@Jasha10
Collaborator Author

Jasha10 commented Nov 8, 2021

@odelalleau would you be willing to take a quick look at the recent commit b593f4f? Outside of that commit, there have been no significant changes since your previous review (besides a git rebase).

@odelalleau
Collaborator

@odelalleau would you be willing to take a quick look at the recent commit b593f4f? Outside of that commit, there have been no significant changes since your previous review (besides a git rebase).

Yeah, sorry I said nothing but I had a look and it seems good to me. Since you're asking though, I remember @omry's been a bit reluctant in the past to add new exception subclasses unless there's a clear need, but that's ok as far as I'm concerned :)

@Jasha10
Collaborator Author

Jasha10 commented Nov 8, 2021

Ok, thanks! I'll mention it to Omry before landing this.

@Jasha10
Collaborator Author

Jasha10 commented Nov 15, 2021

After exchanging emails with Omry, I'm changing back to OmegaConfBaseException instead of introducing the new ConfigSerializationError type, as currently the use case for ConfigSerializationError is very rare (it only applies to python3.6 + pickle + optional element type).

@Jasha10 Jasha10 merged commit 55ce964 into omry:master Nov 15, 2021
@Jasha10 Jasha10 deleted the closes579 branch November 15, 2021 15:06
@Jasha10 Jasha10 changed the title from "Improve Optional[type] validation" to "Improve Optional[type] support" Nov 17, 2021
Jasha10 added a commit to Jasha10/omegaconf that referenced this pull request Dec 12, 2021
Jasha10 added a commit to Jasha10/omegaconf that referenced this pull request Dec 12, 2021
Jasha10 added a commit to Jasha10/omegaconf that referenced this pull request Dec 12, 2021
@Jasha10 Jasha10 mentioned this pull request Dec 13, 2021
Jasha10 added a commit that referenced this pull request Dec 14, 2021
Development

Successfully merging this pull request may close these issues.

  • When validating the assignment to a list, None is not properly validated (#579)
  • Dict[str, Optional[str]] fails (#460)
3 participants