-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature Request] @dataclass __post_init__
for config validation
#377
Comments
Thanks, copying my response from there here for better context: Thanks for the thoughtful feature request @Jasha10.
The use cases you are describing will be addressed by #131. |
Given that we now have a method to convert to native dataclasses, I think this is no longer needed and I suggest we close this. |
Sounds good to me, closing now :)
For people watching this issue or discovering it later: The idea is to use the @dataclass
class MySQLConfig:
host: str = "localhost"
port: int = 3306
def __post_init__(self):
assert 7000 < self.port < 99999
cfg = OmegaConf.structured(MySQLConfig) # create an omegaconf.DictConfig object
cfg2 = OmegaConf.to_object(cfg) # convert the DictConfig into an instance of MySQLConfig
|
FTR, you can also trigger cfg = OmegaConf.structured(MySQLConfig()) # pass the config instance instead of class |
To be clear, this is triggering _instance = MySQLConfig() # __post_init__ is run here
cfg: DictConfig = OmegaConf.structured(_instance) # this does not run __post_init__ |
As per this comment, I am filing this issue against the OmegaConf repository after originally filing here against the Hydra repository.
🚀 Feature Request
The
__post_init__
hook called by the__init__
function defined indataclasses
can be used for post-processing or validation of configuration information.Motivation
One possible use case for the
__post_init__
hook defined in PEP 557 -- Data Classes is the validation of configuration data. Here is an example:When running this file, we get an assertion error because the port number is outside the allowable range:
However, when the
MySQLConfig
class is used in a@hydra.main
app, the validation code from__post_init__
does not get run:No
AssertionError
was triggered. This may be unexpected behavior for users who are familiar with the normal behavior of the@dataclass
decorator.Pitch
Describe the solution you'd like
In the type signature for the
my_app
function above, the type hint for thecfg
argument isMySQLConfig
, but at runtime the type ofcfg
isomegaconf.dictconfig.DictConfig
. What if there were the option to have hydra return a bona fide instance ofMySQLConfig
, so thatassert isinstance(cfg, MySQLConfig)
is true within the body of themy_app
function? If the option were available to work with instances of the user-defined dataclass, rather than instances ofDictConfig
, clients could make use of methods defined on their custom config dataclass, including but not limited to the__post_init__
method that is automatically called upon dataclass initialization.Describe alternatives you've considered
One alternative is to use keyword expansion to convert the parsed
DictConfig
object into aMySQLConfig
object:This does not play well with nested config structures, e.g. if
MySQLConfig
has a field that is another (different) dataclass.Another alternative is to employ the dacite library to convert the
DictConfig
object into an instance ofMySQLConfig
: after callingimport dacite
andimport omegaconf
, the following definition ofmy_app
can be used to achieve the desired effect:The
MySQLConfig.__init__
andMySQLConfig.__post_init__
functions get called whendacite.from_dict
creates theMySQLConfig
instance. To be clear, this alternative involves a three-step procedure: (1) use hydra to parse theDictConfig
object, (2) use omegaconf'sto_container
to get a native python container, and (3) usedacite
to parse the python container as aMySQLConfig
instance. This alternative does work with nested dataclass structures; it is the solution that I am using currently.Of course, there are other packages besides hydra for parsing configuration/settings directly into a dataclass, e.g. argparse-dataclasses or the SimpleParsing package. Would be nice if hydra supported this feature though.
Are you willing to open a pull request?
I think some brainstorming would be required regarding the API for this feature. How would the user indicate whether they want a
DictConfig
or aMySQLConfig
object? Perhaps a keyword argument could be exposed by the@hydra.main
decorator or by thehydra.core.config_store.ConfigStore
class? There are some edge cases to consider too, such as how to handle cases of mixed config (e.g. yaml config mixed with dataclasses).At this point the
hydra
project seems fairly tightly coupled to theomegaconf
project. I'm not sure how easy it would be to implement this feature or where to start. Thedacite
solution mentioned above continues to be useful. Nevertheless, I am creating this issue to share the idea with others...The text was updated successfully, but these errors were encountered: