Major changes to interpolations and resolvers #445

odelalleau · 2020-11-26T21:09:16Z

This is a squashed version of #321.

Add a grammar to parse interpolations
Add support for nested interpolations
Deprecate register_resolver() and introduce register_new_resolver() (that allows resolvers to use non-string arguments, and to decide whether or not to use the cache)
The env resolver now parses environment variables in a way that is consistent with the new interpolation grammar

Fixes #100 #230 #266 #318
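To make "nested interpolations" concrete, here is a minimal toy sketch of how an expression such as `${hosts.${env}}` can be resolved from the inside out. It is only an illustration: this PR uses a proper grammar, not a regex, and the names `select`, `resolve` and the sample `config` are hypothetical helpers that are not part of OmegaConf.

```python
import re
from typing import Any, Mapping

# Matches an innermost interpolation: "${...}" with no "$", "{" or "}" inside.
_INNERMOST = re.compile(r"\$\{([^${}]+)\}")


def select(config: Mapping[str, Any], dotted_key: str) -> Any:
    """Walk a nested mapping using a dot-separated key such as 'hosts.prod'."""
    node: Any = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(config: Mapping[str, Any], text: str) -> str:
    """Substitute innermost interpolations repeatedly until none remain.

    Toy only: no cycle detection, no resolvers, no escaping.
    """
    while True:
        new_text = _INNERMOST.sub(lambda m: str(select(config, m.group(1))), text)
        if new_text == text:
            return text
        text = new_text


# Hypothetical sample data, chosen only for the illustration.
config = {"env": "prod", "hosts": {"prod": "example.org", "dev": "localhost"}}
resolved = resolve(config, "${hosts.${env}}")
```

The inner `${env}` is replaced first, which turns the expression into `${hosts.prod}`, and a second pass resolves that to the final value.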

lgtm-com · 2020-11-26T21:16:21Z

This pull request introduces 1 alert when merging ddb57bb into 800e8a7 - view on LGTM.com

new alerts:

1 for Unused import


odelalleau · 2020-11-26T22:38:47Z

This pull request introduces 1 alert when merging ddb57bb into 800e8a7 - view on LGTM.com

new alerts:

1 for Unused import

Just a note that I fixed that in a force push (it was making flake8 tests fail as well) -- this was caused by my rebase on top of master

omry · 2020-11-28T18:58:28Z

looks like you have conflicts with master. can you fix?

omry

Alright, made a review pass.
lots of comments and questions inline. overall looks good.
(Must say this was easier to review now, in part because of the previous review passes in the other PR, but also because reviewing it as one unit is easier.)


omry · 2020-11-28T19:14:25Z

docs/source/usage.rst

+Environment variables are parsed when they are recognized as valid quantities that
+may be evaluated (e.g., int, float, dict, list):


Can you go over why we think (sometimes) parsing the environment variables, as opposed to always passing them through as-is as strings, is the best choice?

Until yesterday I thought that otherwise there would be no way to write something like:

learning_rate: ${env:LR}

and have learning_rate be a float.

That being said, I just realized that {Integer,Float,Boolean}Node actually convert from string, so maybe this would work with structured configs?
(although I'm not entirely sure why this is allowed in the first place, my first instinct would be that assigning a string to an int/float/bool node should raise an exception)
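The trade-off discussed here (parse the environment variable versus always pass it through as a string) can be sketched with the standard library alone. This is a rough illustration under stated assumptions: `env_as_string` and `env_parsed` are hypothetical helpers, and `ast.literal_eval` stands in for the interpolation grammar that the env resolver actually uses in this PR.

```python
import ast
import os
from typing import Any


def env_as_string(name: str) -> str:
    """Always return the raw string stored in the environment."""
    return os.environ[name]


def env_parsed(name: str) -> Any:
    """Return a parsed value when the string looks like a Python literal.

    Falls back to the raw string otherwise. Stand-in for a real grammar.
    """
    raw = os.environ[name]
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


# Hypothetical variables set only for the illustration.
os.environ["LR"] = "0.001"
os.environ["RUN_NAME"] = "baseline"

lr_raw = env_as_string("LR")        # stays a string
lr_value = env_parsed("LR")         # becomes a float
run_name = env_parsed("RUN_NAME")   # not a literal, stays a string
```

With the string-only variant, `learning_rate` would only end up as a float if something downstream converts it, which is exactly the question raised in this thread.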

Well, you can check and see for yourself :).
I think the answer is no, because the result of the interpolation is returned directly and is not assigned to the config.

Converting on assignment was introduced really early and may have been a mistake.
The motivation is to support converting during merge: in particular merging from cli, where everything is a string.

We should consider deprecating support for assignments that mypy would frown upon if it knew the type.

@dataclass
class Foo:
    x: int = 10

cfg: Foo = OmegaConf.structured(Foo)
cfg.x = "20"  # mypy would bitch here even though it's legal in OmegaConf.
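For readers unfamiliar with "converting on assignment", the behavior can be mimicked with a plain Python descriptor. This is a toy sketch, not OmegaConf's IntegerNode: `IntField` and the stand-in `Foo` class below are hypothetical and only show why a string coming from the command line ends up stored as an int.

```python
from typing import Any, Optional


class IntField:
    """Hypothetical descriptor that converts any assigned value to int."""

    def __set_name__(self, owner: type, name: str) -> None:
        self._storage = "_" + name

    def __get__(self, instance: Any, owner: Optional[type] = None) -> Any:
        if instance is None:
            return self
        return getattr(instance, self._storage)

    def __set__(self, instance: Any, value: Any) -> None:
        # int("20") succeeds, int("abc") raises ValueError.
        setattr(instance, self._storage, int(value))


class Foo:
    x = IntField()

    def __init__(self, x: Any = 10) -> None:
        self.x = x


cfg = Foo()
cfg.x = "20"  # accepted at runtime and stored as the int 20
```

A static type checker has no way to know about the runtime conversion, which is why such an assignment is flagged even though it works.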

We should consider deprecating support for assignments that mypy would frown upon if it knew the type.

I created #459 to keep track of this

Until yesterday I thought that otherwise there would be no way to write something like:

learning_rate: ${env:LR}

and have learning_rate be a float.

Issue 488 is a recent report about this.
I think we should consider validating the return value of custom interpolations against the specified type.
This is a breaking change, so switching to new resolvers is a good opportunity (with a potential opt-out flag - maybe).
As a POC, I gave it a shot in #506 (based on the current interpolation logic). It adds validation for nodes that are annotated (ValueNodes only, and only if they are not AnyNode).

Just a note that this actually doesn't work, I'll look into fixing it. Just to be sure, do we want the following to work:

@dataclass
class Cfg:
    s: str = "7"
    i: int = II("s")

?
(the question is, since s can be cast to int, is it ok to do it implicitly here as well?)

As discussed, I reverted the changes that were attempting to address #488. I will do that in a follow-up PR instead, to avoid adding too much to this one (it's already way too big).

As a result, I also think it's best to revisit the behavior of the env resolver in another follow-up PR, so that when env is changed to only return strings, typed nodes at least keep working as expected (even without a new decode resolver).

@omry if you're ok with that plan then this conversation may be resolved.

Just to clarify:
There are multiple things here that should probably be changed.

1. Support for type annotation for resolvers (with runtime validation).

2. Env resolver should no longer parse input environment variables.

3. Potentially a new built-in resolver oc.decode() that can parse an input with the grammar parser.

488 is about 1.
2 and 3 are related.

I agree, they will be addressed in that order in future PRs


omry · 2020-11-28T20:40:40Z

tests/test_interpolation.py

+# same as the definition (or `OmegaConf.create()` called on it for lists
+# and dictionaries).
+# Order matters! (each entry should only depend on those above)
+TEST_CONFIG_DATA: List[Tuple[str, Any, Any]] = [


I was expecting to see some parser specific unit tests (testing the parser directly without going through the higher layers).
Did I miss it?

No, I didn't realize it was possible until later, at which point I had already written all tests this way.
Do you want me to refactor them?

I missed this one.
Yes - I think having lower level tests will make the tests more robust and easier to work with.
Once you have lower level tests, this entire function can probably be removed.

I refactored these tests in b520815

Essentially I split the big list into 3:

1. Stuff that can be parsed without interpolations (to test the basic constructs of the grammar, like int, float, str, dict, etc.). Those can be parsed without using any config object.

2. Expressions containing interpolations for the parser rule singleElement (this is the biggest chunk).

3. Finally, expressions for the highest-level parser rule configValue.

In the end it still looks pretty similar overall, but at least it's faster and avoids using dummy resolvers everywhere just to test the correct parser rule. Let me know if that looks ok to you.

NB: in a first attempt I tried to go even lower in the rules (e.g., using the dictContainer parser rule for dicts), but the problem is that without an EOF in the parser rule, ANTLR fails to detect some errors (ex: {0: 1}} is not supposed to be a valid dict). That's why I stuck to the singleElement and configValue rules (that contain the EOF).
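The point about EOF can be illustrated with an analogy in plain Python. This is not ANTLR and not the grammar from this PR: the regex below is a hypothetical, deliberately tiny stand-in for a dictContainer-like rule, used only to show why a rule that does not require end-of-input accepts a valid prefix and ignores trailing garbage.

```python
import re

# Hypothetical miniature "dict" rule: {<int>: <int>} with optional spaces.
DICT_PATTERN = re.compile(r"\{\s*\d+\s*:\s*\d+\s*\}")


def parses_without_eof(text: str) -> bool:
    """Like a rule without EOF: succeeds as soon as a valid prefix is found."""
    return DICT_PATTERN.match(text) is not None


def parses_with_eof(text: str) -> bool:
    """Like a rule ending in EOF: the whole input must be consumed."""
    return DICT_PATTERN.fullmatch(text) is not None


valid = "{0: 1}"
invalid = "{0: 1}}"

accepted_prefix = parses_without_eof(invalid)  # the extra "}" goes unnoticed
rejected_full = parses_with_eof(invalid)       # the extra "}" is detected
```

Testing through rules that end with EOF plays the role of `fullmatch` here, so malformed inputs of this kind are reported instead of being silently truncated.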

odelalleau · 2020-12-06T22:19:44Z

Regarding #448, currently it doesn't work because @ is not a valid character in IDs. I can easily fix it but I'd like to clarify a few things first:

@ should be only allowed in key names, not resolver names, correct?
@ should be allowed at any position in the key name, correct?
Is there any plan to support @ in keys in Hydra's override grammar? Currently this seems to conflict with the group@pkg syntax

omry · 2020-12-07T02:50:21Z

Regarding #448, currently it doesn't work because @ is not a valid character in IDs. I can easily fix it but I'd like to clarify a few things first:

@ should be only allowed in key names, not resolver names, correct?

@ should be allowed at any position in the key name, correct?

Is there any plan to support @ in keys in Hydra's override grammar? Currently this seems to conflict with the group@pkg syntax

This is an extension of group@pkg, allowing an empty package in Hydra.
It's not landed yet, but you can see it here.

group and group@pkg are now going to be supported as interpolation keys in the defaults list (this is non-standard interpolation support). To enable that, OmegaConf needs to accept @ as a legal character in config keys.
This can also be useful for things like emails. From the perspective of OmegaConf it's just another valid character.
Let me know if you have any more questions.

Yes
Yes
No conflict, this is the same, just allowing for an empty package.

odelalleau · 2020-12-07T03:47:31Z

Yes

Yes

No conflict, this is the same, just allowing for an empty package.

Ok thanks, I added this in 3af2f85 (edited: initially it was d959a13 but I realized I could make the grammar more readable) (also added another commit, af3870d, to add a missing test for @ character in unquoted strings, which was already supported)

Regarding Hydra, I tried syncing facebookresearch/hydra#1170 with a config.yaml looking like:

foo:
  x@y: youpla

and ran the command

python my_app.py foo.x@y=boom

and Hydra complains with "Override foo.x@y=boom looks like a config group override, but config group 'foo.x' does not exist."
From what you said I suspect this is behaving as expected, but I still thought I'd mention it just in case.

omry · 2020-12-07T17:41:55Z

Commented in one of the commits, this seems a bit more complicated than I would have wanted.

About Hydra:
Yes, this is the expected behavior. you can't override keys with @ in them from the command line because the syntax is reserved. if it becomes a real problem we can support escaping of the @.

#1170 is still work in progress, the new logic is not integrated there yet.
Interpolation in the defaults list will be special. You can see a bit about it here.

Here is a concrete test that has an interpolation with @ in the key.

odelalleau · 2020-12-07T17:58:08Z

Commented in one of the commits, this seems a bit more complicated than I would have wanted.

Link to the discussion for reference: d959a13#r44887061

Yes, this is the expected behavior. you can't override keys with @ in them from the command line because the syntax is reserved. if it becomes a real problem we can support escaping of the @.

Ok, just wanted to be sure!

Here is a concrete test that has an interpolation with @ in the key.

I see, thanks!

odelalleau · 2020-12-09T02:53:44Z

I did a force push to replace my old commit adding support for @ by another one (9e1bc41) that adds support for almost any character, as discussed in 200fef2#r44900426 (compared to this comment, I also added \ to the list of forbidden characters, in case we might want later to add some kind of escaping).

One open question is whether we really want to allow more varied dictionary keys: see discussion at 200fef2#commitcomment-44939891

omry

Recent batch of changes looks good. I still need to have a second pass on your responses to everything else.


omry

Partial pass. Got about a million lines left to review :)


omry

Got most of it, but ran out of energy for now.


omry · 2021-02-02T21:51:17Z

docs/source/usage.rst

+Environment variables are parsed when they are recognized as valid quantities that
+may be evaluated (e.g., int, float, dict, list):


Until yesterday I thought that otherwise there would be no way to write something like:

learning_rate: ${env:LR}

and have learning_rate be a float.

Issue 488 is a recent report about this.
I think we should consider validating the return value of custom interpolations against the specified type.
This is a breaking change, so switching to new resolvers is a good opportunity (with a potential opt-out flag - maybe).
As a POC, I gave it a shot in #506 (based on the current interpolation logic). It adds validation for nodes that are annotated (ValueNodes only, and only if they are not AnyNode).


omry · 2021-02-03T00:57:40Z

This is also closing #266, right?


omry · 2021-02-17T02:03:42Z

Your fixed string:

Fixes #100 #230 #266 #318

Only closed 100.
I think it works with a comma.

odelalleau · 2021-02-17T03:02:20Z

Your fixed string:
Fixes #100 #230 #266 #318
Only closed 100.
I think it works with a comma.

Oops, thanks for taking care of it! Looks like we actually need to repeat the "Fixes / Closes" keyword (https://stackoverflow.com/questions/3547445/closing-multiple-issues-in-github-with-a-commit-message)
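The conclusion above (each issue reference needs its own closing keyword) can be approximated with a small regex. This is a rough model for illustration, not GitHub's actual implementation: `closed_issues` is a hypothetical helper, and it only handles same-repository references of the form `#123`.

```python
import re
from typing import List

# Closing keywords followed by a single "#<number>" reference.
_CLOSING = re.compile(
    r"\b(?:close[sd]?|fix(?:es|ed)?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def closed_issues(message: str) -> List[int]:
    """Return issue numbers that are directly preceded by a closing keyword."""
    return [int(number) for number in _CLOSING.findall(message)]


original = "Fixes #100 #230 #266 #318"
repeated = "Fixes #100, fixes #230, fixes #266, fixes #318"

only_first = closed_issues(original)   # only the first reference has a keyword
all_four = closed_issues(repeated)     # every reference has its own keyword
```

Under this model the original description links a single issue, which matches what was observed when the PR was merged.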

odelalleau mentioned this pull request Nov 26, 2020

[OmegaConf 2.1] Grammar for parsing of interpolations #321

Closed


odelalleau mentioned this pull request Nov 26, 2020

Deprecation of register_resolver() to introduce new functionalities #426

Closed

odelalleau force-pushed the interpolation_grammar branch from ddb57bb to ee83314 Compare November 26, 2020 22:02

odelalleau force-pushed the interpolation_grammar branch from ee83314 to ad826e7 Compare November 28, 2020 20:08

omry requested changes Nov 28, 2020


odelalleau force-pushed the interpolation_grammar branch from 3ada218 to a8ff6f9 Compare November 29, 2020 15:21

odelalleau force-pushed the interpolation_grammar branch from d959a13 to 3af2f85 Compare December 7, 2020 04:44

odelalleau force-pushed the interpolation_grammar branch from 200fef2 to 9e1bc41 Compare December 9, 2020 02:49

omry reviewed Dec 13, 2020


omry reviewed Dec 13, 2020


odelalleau mentioned this pull request Dec 14, 2020

Allow more types of dictionary keys in overrides grammar facebookresearch/hydra#1208

Merged

odelalleau referenced this pull request Dec 18, 2020

WIP - Allow most characters in node key names and any primitive as dictionary key

200fef2


omry reviewed Feb 2, 2021


This was referenced Feb 3, 2021

[Feature Request] Array Element Overrides from dotlist via getitem syntax #179

Closed

null as default value for env interpolation #230

Closed

omry mentioned this pull request Feb 3, 2021

Add env_str resolver built-in resolver #383

Closed

odelalleau added 20 commits February 11, 2021 18:48

Fix coverage

e78142a

Test to see if DeepCode is happier

7c0ab04

Revert "Test to see if DeepCode is happier"

977f2ee

This reverts commit 51c201f.

Rename new_register_resolver() -> register_new_resolver()

7bcd572

Disable deprecation warning for register_resolver()

7b47c52

Will be re-enabled once the behavior of `register_new_resolver()` is finalized. Also renamed some variables for clarity and consistency.

Fix a few minor issues after rebase on top of current master

262b05b

Fix coverage

7c53ed5

Add test & news for handling of null as default value to the env resolver

9540b47

Fixes omry#230

Use f-string instead of .format()

0d00b72

Run more tests with register_resolver()

5335ff5

Remove unused function in tests

0083a38

Safer handling of $ signs in error messages

3c45c9c

Detect invalid interpolation syntax on assignment

7d1907b

Move grammar tests to their own file

e7147b6

Cleaner implementation for interpolation syntax check on assignment

1cc2d27

Implement a cache for grammar objects

d19d078

Optimize the detection of basic interpolation patterns

bbe0d1f

Minor (no functional change)

dcf8952

Add test of escaped interpolations with simple regex matching

ed80d13

Minor news formatting improvement

5cb75f6

odelalleau force-pushed the interpolation_grammar branch from 5c03808 to 5cb75f6 Compare February 11, 2021 23:53

odelalleau added 2 commits February 11, 2021 19:13

Minor change to fix a DeepCode warning

3993195

Disable spurious DeepCode warning

d22e372

omry merged commit 499de6b into omry:master Feb 12, 2021

odelalleau deleted the interpolation_grammar branch February 12, 2021 03:23

odelalleau mentioned this pull request Feb 16, 2021

Improve handling of interpolations pointing to missing nodes #545

Merged

odelalleau mentioned this pull request Mar 13, 2021

Support optional _parent_ and _root_ parameters in custom resolvers #599

Merged

odelalleau mentioned this pull request May 14, 2021

Thread-safety concern related to interpolation parsing #715

Closed
