
[Feature Request] Easier **replacement** of subconfigs #2227

Open
ssnl opened this issue May 24, 2022 · 19 comments
Labels
enhancement Enhancement request

Comments

@ssnl

ssnl commented May 24, 2022

🚀 Feature Request

Hydra is a great tool for merging configs. E.g., it can easily start with a default config

x: 3
y: 'me'
data:
  path: ???

and add user specifications

x: 2
z: 4
data:
  path: ~/data

to obtain

x: 2
y: 'me'
z: 4
data:
  path: ~/data

However, in many cases, especially when configuring a system made of components, we want to replace a part of a config. E.g., with a default config,

name: 'root_finder'
algorithm:
  _target_: alg.GradientBased
  optimizer: 
    _target_: optim.Adam
    lr: 0.3
    eps: 1e-5

we might want to change algorithm.optimizer to a different choice that has a completely different set of options, such as

  optimizer: 
    _target_: optim.Newton
    num_iterations: 1000

Merge behavior will instead result in

  optimizer: 
    _target_: optim.Newton
    lr: 0.3
    eps: 1e-5
    num_iterations: 1000

which is not what we want. In other words, here the two optimizer choices are different configurable objects, rather than namespaces/subconfigs to be stitched together to form the big config file.
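For illustration, here is a minimal stdlib sketch of the difference, with plain dicts standing in for the YAML (the `deep_merge` helper is a hypothetical stand-in for config merging, not OmegaConf's API):

```python
def deep_merge(dst, src):
    """Recursively merge src into dst, returning a new dict:
    scalar values in src win, nested dicts are merged key by key."""
    out = dict(dst)
    for k, v in src.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = deep_merge(out[k], v)
        else:
            out[k] = v
    return out

adam = {"_target_": "optim.Adam", "lr": 0.3, "eps": 1e-5}
newton = {"_target_": "optim.Newton", "num_iterations": 1000}

# merging keeps Adam's stale lr/eps keys next to Newton's:
merged = deep_merge(adam, newton)
assert merged == {"_target_": "optim.Newton", "lr": 0.3,
                  "eps": 1e-5, "num_iterations": 1000}

# replacement, which this issue asks for, would simply discard them:
assert newton == {"_target_": "optim.Newton", "num_iterations": 1000}
```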

Now, AFAICT, Hydra's main mechanism for this is the defaults list plus overrides. The syntax can be a bit verbose, but it works. Here's an example:

# /config.yaml
defaults:
- optim@algorithm.optimizer: adam

name: 'root_finder'
algorithm:
  _target_: alg.GradientBased
  optimizer: ???

# /optim/adam.yaml
_target_: optim.Adam
lr: 0.3
eps: 1e-5

# /optim/newton.yaml
_target_: optim.Newton
num_iterations: 1000

and the user can override with ~algorithm.optimizer optim@algorithm.optimizer=newton (note the necessity of using two flags).

Okay, this might be okay if your config is this simple. But what if it is not? Say we need to select two algorithms, each with a different default optimizer, and the algorithms also come with different types..... Then we need to do

# /config.yaml
defaults:
- algorithm@algorithm_one: gradient_based_default_for_alg_one
- algorithm@algorithm_two: gradient_based_default_for_alg_two

name: 'root_finder'
algorithm_one: ???
algorithm_two: ???

# /algorithm/gradient_based_default_for_alg_one.yaml
defaults:
- optim@optimizer: adam_default_for_alg_one
_target_: alg.GradientBased
name: 'grad_based'
optimizer: ???

# /algorithm/gradient_based_default_for_alg_two.yaml
defaults:
- optim@optimizer: adam_default_for_alg_two
_target_: alg.GradientBased
name: 'grad_based'
optimizer: ???

# /algorithm/cma_es.yaml
name: 'cma_es'

# /optim/newton.yaml
_target_: optim.Newton
num_iterations: 1000

# /optim/adam_default_for_alg_one.yaml
_target_: optim.Adam
lr: 0.3
eps: 1e-5

# /optim/adam_default_for_alg_two.yaml
_target_: optim.Adam
lr: 0.1
eps: 1e-1


Hmmm, seriously? Throwing in a defaults list whenever a subconfig is meant to be replaced (rather than merged) by the user specification quickly becomes annoying, difficult to read, and hard to manage. Surely there should be an easier way to do this....

Implications for structured configs

For the same reason (I believe), it can be annoying to compose structured configs that use a subclass instance as the default value for a superclass-annotated field. To make it concrete, consider

from dataclasses import dataclass, field
from hydra.core.config_store import ConfigStore
from omegaconf import MISSING

@dataclass
class Base:
  x: int = MISSING

@dataclass
class ImplA(Base):
  x: int = 4
  y: int = 10

@dataclass
class ImplB(Base):
  x: int = 100
  y: str = 'y'

@dataclass
class Config:
  key: Base = field(default_factory=ImplA)

cs = ConfigStore.instance()
cs.store(name='config', node=Config)

then, if we try to merge config with a user override/config that uses ImplB for key, OmegaConf merging fails, because in OmegaConf's view, Config's key field has type ImplA (inferred from its ImplA default value), despite the Base type annotation.
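A minimal stdlib sketch of this failure mode, with plain dicts standing in for the dataclasses (the `typed_merge` helper is hypothetical, mimicking how a node gets typed by its concrete default rather than its annotation; it is not OmegaConf's API):

```python
def typed_merge(dst, src):
    """Merge src into dst, typing each key by dst's current value,
    the way a node typed by its ImplA default rejects ImplB's values."""
    out = dict(dst)
    for k, v in src.items():
        if k in out and not isinstance(v, type(out[k])):
            raise TypeError(
                f"{k}: {type(v).__name__} incompatible with {type(out[k]).__name__}")
        out[k] = v
    return out

impl_a = {"x": 4, "y": 10}     # ImplA defaults: y is inferred as int
impl_b = {"x": 100, "y": "y"}  # ImplB values: y is a str

try:
    typed_merge(impl_a, impl_b)
except TypeError as e:
    print(e)  # y: str incompatible with int
```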

This can also be worked around with the same defaults list approach. But, as mentioned above, it quickly becomes unmanageable in slightly larger projects.

Am I missing anything? Please! I want to use Hydra so badly.... But this is such a pain so far...

@ssnl ssnl added the enhancement label May 24, 2022
@Jasha10
Collaborator

Jasha10 commented May 24, 2022

Thank you for the thoughtful note @ssnl.

and the user can use override with ~algorithm.optimizer optim@algorithm.optimizer=newton (note the necessity of using two flags).

I believe you're mistaken about the necessity of two flags; with Hydra 1.2, I'm getting the same result using just optim@algorithm.optimizer=newton at the command line.

@ssnl
Author

ssnl commented May 24, 2022

I believe you're mistaken about the necessity of two flags; with Hydra 1.2, I'm getting the same result using just optim@algorithm.optimizer=newton at the command line.

@Jasha10 Ah yes, you are right in this case! I was thinking about a pattern that puts all default values in a separate group for easier management (e.g., so that the optim group only contains the two actual options, adam and newton, for users to compose with, not default values for some config). To make it clear, here's what I mean:

# two optim options 
cs.store(group='optim', name='adam', node=r'''
_target_: optim.Adam
lr: 0.3
eps: 1e-5
''')

cs.store(group='optim', name='newton', node=r'''
_target_: optim.Newton
num_iterations: 1000
''')

# the main config, and..
cs.store(name='config', node=r'''
defaults:
- defaults/algorithm@algorithm.optimizer: optimizer

name: 'root_finder'
algorithm:
  _target_: alg.GradientBased
  optimizer: ???
''')

# its default value for the optimizer!
cs.store(group='defaults/algorithm', name='optimizer', node=r'''
_target_: optim.Adam
lr: 0.4
eps: 1e-4
''')

Then,

from hydra import compose, initialize
from omegaconf import OmegaConf

# context initialization
with initialize(version_base=None):
    cfg = compose(config_name="config", overrides=['+optim@algorithm.optimizer=newton'])
    print(OmegaConf.to_yaml(cfg))

gives

algorithm:
  optimizer:
    _target_: optim.Newton
    lr: 0.4
    eps: 0.0001
    num_iterations: 1000
  _target_: alg.GradientBased
name: root_finder

unless we use

    cfg = compose(config_name="config", overrides=['~defaults/algorithm@algorithm.optimizer', '+optim@algorithm.optimizer=newton'])

@ssnl
Author

ssnl commented May 24, 2022

In any case.... is there a better way to achieve what is described in the main post than the defaults list at the moment....?

@Jasha10
Collaborator

Jasha10 commented May 24, 2022

I was thinking about a pattern that puts all default values in a separate group for easier management

I see.

In any case.... is there a better way to achieve what is described in the main post than the defaults list at the moment....?

Unfortunately no, not that I'm aware of.

@Jasha10
Collaborator

Jasha10 commented May 25, 2022

It might be possible to get some traction with OmegaConf interpolations or custom resolvers, which are somewhat orthogonal to defaults lists.

I'm thinking something along the lines of

cfg = compose(config_name="config", overrides=['algorithm.optimizer=${optimizers.adam}'])

or using a custom resolver

cfg = compose(config_name="config", overrides=['algorithm.optimizer=${get_config:optimizers.adam}'])

where get_config would be a custom resolver defined to look up the given config somehow.

This could work because

@Jasha10
Collaborator

Jasha10 commented May 25, 2022

or using a custom resolver

Here is a concrete example of this idea:

# optim/adam.yaml
_target_: optim.Adam
lr: 0.3
eps: 1e-5

# optim/newton.yaml
_target_: optim.Newton
num_iterations: 1000

# defaults/algorithm/optimizer.yaml
_target_: optim.Adam
lr: 0.4
eps: 1e-4

# config.yaml
defaults:
- defaults/algorithm@algorithm.optimizer: optimizer
- _self_
name: 'root_finder'
algorithm:
  _target_: alg.GradientBased
  optimizer: ???

# main.py
from hydra import compose, initialize
from omegaconf import OmegaConf

OmegaConf.register_new_resolver(
    "get_config", lambda name: OmegaConf.load(f"{name}.yaml")
)

with initialize(".", version_base=None):
    cfg = compose(
        config_name="config",
        overrides=['algorithm.optimizer=${get_config:optim/newton}'],
    )
    print(OmegaConf.to_yaml(cfg, resolve=True))

$ python main.py
algorithm:
  optimizer:
    _target_: optim.Newton
    num_iterations: 1000
  _target_: alg.GradientBased
name: root_finder

Currently this only works with yaml files on disk; to make this work with the ConfigStore, we'd need to modify the get_config resolver so that it hooks into Hydra's config-loading mechanism.

A downside of this approach is that, once you use an interpolation or resolver, you cannot easily "merge into" the interpolated value. For example, combining the overrides 'algorithm.optimizer=${get_config:optim/newton}' and 'algorithm.optimizer.num_iterations=200' does not work.

@Jasha10
Collaborator

Jasha10 commented May 26, 2022

Zooming out, Hydra's config composition is currently merge-based. The override '+optim@algorithm.optimizer=newton' means "merge the optim/newton config into the algorithm.optimizer package". This issue is about how to "replace the algorithm.optimizer package with the optim/newton config".

In the future maybe we can make Hydra's defaults lists more flexible to allow this. For example, we could use some new syntax like ~= (e.g. '+optim@algorithm.optimizer~=newton') to mean replace/update instead of merge.

Here are my notes on how this could be implemented:
Currently, Hydra's defaults list workflow works in three steps:

  1. Compose a defaults tree (done by _create_defaults_tree_impl)
  2. Flatten the defaults tree into a list (done by _create_defaults_list)
  3. Merge all configs from the defaults list to create the output config (done in _compose_config_from_defaults_list)

Perhaps we can modify step 3 so that some defaults will result in a replace operation instead of a merge operation.
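A toy sketch of what step 3 could look like with a per-entry merge-or-replace op (purely hypothetical; this does not reflect Hydra's actual internals, and the `compose_from_defaults` helper is invented for illustration):

```python
def deep_merge(dst, src):
    """Recursive dict merge: nested dicts are merged, scalars in src win."""
    out = dict(dst)
    for k, v in src.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = deep_merge(out[k], v)
        else:
            out[k] = v
    return out

def compose_from_defaults(defaults):
    """defaults: list of (package, node, op) tuples, op in {'merge', 'replace'}.
    'replace' models the proposed '~=' syntax: discard old keys wholesale."""
    cfg = {}
    for package, node, op in defaults:
        parent = cfg
        *path, last = package.split(".")
        for part in path:
            parent = parent.setdefault(part, {})
        if op == "replace" or not isinstance(parent.get(last), dict):
            parent[last] = node
        else:
            parent[last] = deep_merge(parent[last], node)
    return cfg

cfg = compose_from_defaults([
    ("algorithm.optimizer", {"_target_": "optim.Adam", "lr": 0.3}, "merge"),
    ("algorithm.optimizer", {"_target_": "optim.Newton", "num_iterations": 1000}, "replace"),
])
assert cfg == {"algorithm": {"optimizer":
               {"_target_": "optim.Newton", "num_iterations": 1000}}}
```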

@greaber

greaber commented May 26, 2022

By the way, I'm not sure if this is the place to ask this, but I've never fully grasped the concept of the defaults list and what it lets you do that you couldn't do if you just allowed values in config to be functional expressions that could evaluate to, say, the value at a particular key (in the same file or a different one) or the result of merging two values.

So you could do something like

name: 'root_finder'
algorithm:
  _target_: alg.GradientBased
  optimizer: @{optimizers.adam}

optimizers:
  adam:
    _target_: optim.Adam
    lr: 0.3
    eps: 1e-5
  newton:
    _target_: optim.Newton
    lr: 0.4
    eps: 0.0001
    num_iterations: 1000

Or, if you wanted to reuse these optimizer configs in different places, you might have an optimizers.yaml file, and then you would put the contents of the optimizers key there and then do:

name: 'root_finder'
algorithm:
  _target_: alg.GradientBased
  optimizer: @{optimizers/adam}

...

Or, if you needed to override the learning rate,

name: 'root_finder'
algorithm:
  _target_: alg.GradientBased
  optimizer: @{merge(optimizers/adam, _optimizer_overrides)}

_optimizer_overrides:
  lr: 0.1

...

Due to the limitations of yaml syntax, it probably makes sense to not allow nesting dict expressions inside "functional" expressions but instead require them to be assigned to a variable, like _optimizer_overrides. Also, I'm using the convention here that underscored config is for internal use, but there could be a way to make _optimizer_overrides actually not be in the output config (if that is not already possible).
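To make the proposal concrete, here is a tiny stdlib sketch of evaluating such references over plain dicts (the `@{...}` syntax and the `resolve` helper are hypothetical, mirroring the comment above; only the dotted in-file form is shown, and the merge form is omitted):

```python
def resolve(node, root=None):
    """Recursively evaluate @{dotted.path} references against the root config."""
    root = node if root is None else root
    if isinstance(node, dict):
        return {k: resolve(v, root) for k, v in node.items()}
    if isinstance(node, str) and node.startswith("@{") and node.endswith("}"):
        target = root
        for part in node[2:-1].split("."):
            target = target[part]
        return resolve(target, root)  # references may point at references
    return node

cfg = {
    "algorithm": {"_target_": "alg.GradientBased",
                  "optimizer": "@{optimizers.adam}"},
    "optimizers": {"adam": {"_target_": "optim.Adam", "lr": 0.3}},
}
assert resolve(cfg)["algorithm"]["optimizer"] == {"_target_": "optim.Adam", "lr": 0.3}
```

Note that, like the interpolation approach discussed earlier in the thread, this replaces the value wholesale rather than merging into it.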

@Jasha10
Collaborator

Jasha10 commented May 26, 2022

I'm not sure if this is the place to ask this

@greaber would you mind opening a topic under Discussions? Thanks!

@ssnl
Author

ssnl commented May 26, 2022

Due to the limitations of yaml syntax, it probably makes sense to not allow nesting dict expressions inside "functional" expressions but instead require them to be assigned to a variable, like _optimizer_overrides. Also, I'm using the convention here that underscored config is for internal use, but there could be a way to make _optimizer_overrides actually not be in the output config (if that is not already possible).

@greaber The reason I wanted to use Hydra is to move away from a personal argparsing package that I've been using, to something more standard. IMO my package's syntax for this issue is quite nice, and I want to share it here. It works like this:

name: 'root_finder'
algorithm:
  _target_: alg.GradientBased
  optimizer: 
    adam:  # this key specifies which optimizer. in python code there is an easy way to build the mapping: name -> class
       # overwrites begin here
       lr: 0.1 
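That pattern could be sketched roughly like this (a hypothetical helper, not the actual package; the registry maps names to per-optimizer defaults here, standing in for the name -> class mapping):

```python
# illustrative registry: optimizer name -> that optimizer's default options
OPTIMIZER_DEFAULTS = {
    "adam": {"_target_": "optim.Adam", "lr": 0.3, "eps": 1e-5},
    "newton": {"_target_": "optim.Newton", "num_iterations": 1000},
}

def build_optimizer_config(spec):
    """spec is a one-key dict: the key selects the optimizer, and the
    value holds overrides applied on top of that optimizer's defaults."""
    (name, overrides), = spec.items()
    return {**OPTIMIZER_DEFAULTS[name], **(overrides or {})}

cfg = build_optimizer_config({"adam": {"lr": 0.1}})
assert cfg == {"_target_": "optim.Adam", "lr": 0.1, "eps": 1e-5}
```

Because defaults are looked up per name, switching the key to "newton" replaces the whole config rather than merging stale Adam keys into it.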

@albertfgu

+1 to this. It's something I commonly want in my ML workflow, where interchangeable configs often have completely different keys.

@Penchekrak

Penchekrak commented Aug 25, 2022

I've created this creepy workaround; however, I believe it is somewhat along the lines of how such behaviour could be incorporated into OmegaConf in a less intrusive way. It does force you to use structured configs, since introducing special keywords into YAML configs would, to my taste, be even uglier and inconsistent with current Hydra functionality (there are currently no keywords that alter config-creation behaviour except the defaults list).

# omegaconf_patch.py

import copy
from typing import Union, Dict, Any, Optional, Type, _type_check, _SpecialForm

from omegaconf import DictKeyType, DictConfig, KeyValidationError, ValidationError, ReadonlyConfigError
from omegaconf._utils import _valid_dict_key_annotation_type, is_structured_config, is_structured_config_frozen, \
    get_type_of, format_and_raise, get_type_hint, is_dict_annotation, is_list_annotation, \
    is_tuple_annotation, _get_value, get_value_kind, ValueKind, _is_none, _resolve_optional
from omegaconf.base import Box, ContainerMetadata, Container, Node, UnionNode
from omegaconf.basecontainer import _update_types, _create_structured_with_missing_fields, BaseContainer

_OVERRIDABLE_CLASS_FIELD = '__overridable__'


def is_structured_config_overridable(obj):
    type_ = get_type_of(obj)
    return getattr(type_, _OVERRIDABLE_CLASS_FIELD, False)  # type: ignore


def __init__(
        self,
        content: Union[Dict[DictKeyType, Any], "DictConfig", Any],
        key: Any = None,
        parent: Optional[Box] = None,
        ref_type: Union[Any, Type[Any]] = Any,
        key_type: Union[Any, Type[Any]] = Any,
        element_type: Union[Any, Type[Any]] = Any,
        is_optional: bool = True,
        flags: Optional[Dict[str, bool]] = None,
) -> None:
    try:
        if isinstance(content, DictConfig):
            if flags is None:
                flags = content._metadata.flags
        super(DictConfig, self).__init__(
            parent=parent,
            metadata=ContainerMetadata(
                key=key,
                optional=is_optional,
                ref_type=ref_type,
                object_type=dict,
                key_type=key_type,
                element_type=element_type,
                flags=flags,
            ),
        )

        if not _valid_dict_key_annotation_type(key_type):
            raise KeyValidationError(f"Unsupported key type {key_type}")

        if is_structured_config(content) or is_structured_config(ref_type):
            ##### THIS IS NEW CODE #####
            if is_structured_config_overridable(content) or is_structured_config_overridable(
                    ref_type
            ):
                self._set_flag("overridable", True)  # allows overrides
                self._set_flag("struct", False)  # allows new keys to be provided in config
                self._metadata.ref_type = ref_type.__bases__[0]  # unwraps original ref_type
            ##### END OF NEW CODE #####
            self._set_value(content, flags=flags)
            if is_structured_config_frozen(content) or is_structured_config_frozen(
                    ref_type
            ):
                self._set_flag("readonly", True)

        else:
            if isinstance(content, DictConfig):
                metadata = copy.deepcopy(content._metadata)
                metadata.key = key
                metadata.ref_type = ref_type
                metadata.optional = is_optional
                metadata.element_type = element_type
                metadata.key_type = key_type
                self.__dict__["_metadata"] = metadata
            self._set_value(content, flags=flags)
    except Exception as ex:
        format_and_raise(node=None, key=key, value=None, cause=ex, msg=str(ex))


def _map_merge(dest: "BaseContainer", src: "BaseContainer") -> None:
    """merge src into dest and return a new copy, does not modified input"""
    from omegaconf import AnyNode, DictConfig, ValueNode

    assert isinstance(dest, DictConfig)
    assert isinstance(src, DictConfig)
    src_type = src._metadata.object_type
    src_ref_type = get_type_hint(src)
    assert src_ref_type is not None

    # If source DictConfig is:
    #  - None => set the destination DictConfig to None
    #  - an interpolation => set the destination DictConfig to be the same interpolation
    if src._is_none() or src._is_interpolation():
        dest._set_value(src._value())
        _update_types(node=dest, ref_type=src_ref_type, object_type=src_type)
        return

    dest._validate_merge(value=src)

    def expand(node: Container) -> None:
        rt = node._metadata.ref_type
        val: Any
        if rt is not Any:
            if is_dict_annotation(rt):
                val = {}
            elif is_list_annotation(rt) or is_tuple_annotation(rt):
                val = []
            else:
                val = rt
        elif isinstance(node, DictConfig):
            val = {}
        else:
            assert False

        node._set_value(val)

    if (
            src._is_missing()
            and not dest._is_missing()
            and is_structured_config(src_ref_type)
    ):
        # Replace `src` with a prototype of its corresponding structured config
        # whose fields are all missing (to avoid overwriting fields in `dest`).
        src = _create_structured_with_missing_fields(
            ref_type=src_ref_type, object_type=src_type
        )

    if (dest._is_interpolation() or dest._is_missing()) and not src._is_missing():
        expand(dest)

    src_items = list(src) if not src._is_missing() else []
    for key in src_items:
        src_node = src._get_node(key, validate_access=False)
        dest_node = dest._get_node(key, validate_access=False)
        assert isinstance(src_node, Node)
        assert dest_node is None or isinstance(dest_node, Node)
        src_value = _get_value(src_node)

        src_vk = get_value_kind(src_node)
        src_node_missing = src_vk is ValueKind.MANDATORY_MISSING
        
        ##### THIS IS NEW CODE #####
        if dest_node is not None and dest_node._get_node_flag('overridable'):
            flags = dest_node._metadata.flags
            dest[key] = DictConfig(  # this creates node from scratch to override previous value
                dest_node._metadata.ref_type,
                parent=dest,
                ref_type=dest_node._metadata.ref_type,
                is_optional=dest_node._metadata.optional,
                flags=dest_node._metadata.flags
            )
            dest_node = dest._get_node(key)
            dest_node._metadata.flags = flags  # I believe it's a bug in omegaconf that flags are discarded
                                               # when setting a node to a DictConfig via __setitem__ (at dest[key]),
                                               # so I manually set them here.
        ##### END OF NEW CODE #####

        if isinstance(dest_node, DictConfig):
            dest_node._validate_merge(value=src_node)

        if (
                isinstance(dest_node, Container)
                and dest_node._is_none()
                and not src_node_missing
                and not _is_none(src_node, resolve=True)
        ):
            expand(dest_node)

        if dest_node is not None and dest_node._is_interpolation():
            target_node = dest_node._maybe_dereference_node()
            if isinstance(target_node, Container):
                dest[key] = target_node
                dest_node = dest._get_node(key)

        is_optional, et = _resolve_optional(dest._metadata.element_type)
        if dest_node is None and is_structured_config(et) and not src_node_missing:
            # merging into a new node. Use element_type as a base
            dest[key] = DictConfig(
                et, parent=dest, ref_type=et, is_optional=is_optional
            )
            dest_node = dest._get_node(key)

        if dest_node is not None:
            if isinstance(dest_node, BaseContainer):
                if isinstance(src_node, BaseContainer):
                    dest_node._merge_with(src_node)
                elif not src_node_missing:
                    dest.__setitem__(key, src_node)
            else:
                if isinstance(src_node, BaseContainer):
                    dest.__setitem__(key, src_node)
                else:
                    assert isinstance(dest_node, (ValueNode, UnionNode))
                    assert isinstance(src_node, (ValueNode, UnionNode))
                    try:
                        if isinstance(dest_node, AnyNode):
                            if src_node_missing:
                                node = copy.copy(src_node)
                                # if src node is missing, use the value from the dest_node,
                                # but validate it against the type of the src node before assignment
                                node._set_value(dest_node._value())
                            else:
                                node = src_node
                            dest.__setitem__(key, node)
                        else:
                            if not src_node_missing:
                                dest_node._set_value(src_value)

                    except (ValidationError, ReadonlyConfigError) as e:
                        dest._format_and_raise(key=key, value=src_value, cause=e)
        else:
            from omegaconf import open_dict

            if is_structured_config(src_type):
                # verified to be compatible above in _validate_merge
                with open_dict(dest):
                    dest[key] = src._get_node(key)
            else:
                dest[key] = src._get_node(key)

    _update_types(node=dest, ref_type=src_ref_type, object_type=src_type)

    # explicit flags on the source config are replacing the flag values in the destination
    flags = src._metadata.flags
    assert flags is not None
    for flag, value in flags.items():
        if value is not None:
            dest._set_flag(flag, value)


DictConfig.__init__ = __init__
BaseContainer._map_merge = _map_merge


@_SpecialForm
def Overridable(self, type_: Type):
    type_ = _type_check(type_, f"{self} requires a single type.")
    return type(type_.__name__ + "Overridable", (type_,), {_OVERRIDABLE_CLASS_FIELD: True})

Importing this prior to config composition lets you write

from omegaconf_patch import Overridable
from dataclasses import dataclass

@dataclass
class OptimizerConfig:
    _target_: str
    lr: float
    
@dataclass
class Config:
    optimizer: Overridable[OptimizerConfig]

This preserves typechecking for the fields _target_ and lr while overriding the whole optimizer field (discarding previous ad-hoc key-value pairs and placing new ones) for any non-empty entry of a particular optimizer specification in YAML. Note, however, that this way it is not possible to define default values for OptimizerConfig from within Config, since each time merging encounters a non-null entry of the config specification, it recreates the necessary DictConfig from scratch using only type information.

@omry
Collaborator

omry commented Aug 25, 2022

I didn't read the whole thing, but as Jasha pointed out the config composition is a merge based process.
You can achieve replacement via the defaults list if you leave the node empty. e.g:

# config.yaml
defaults:
  - optimizer: adam
 
optimizer: ???

This would populate adam by default, but allow you to replace it with anything else by overriding the defaults list (optimizer=nesterov).

@Penchekrak

Penchekrak commented Aug 25, 2022

You can achieve replacement via the defaults list if you leave the node empty.

Yeah, sure, you can, and a similar solution was proposed in the original issue. However, as the original poster pointed out, this quickly becomes very verbose and difficult to maintain: you have to separate the parts of the config that should be overridden and declare them via the defaults list, from the other "mergeable" parts written in regular config syntax. Take a look, however, at ListConfig nodes. Those are always overridden: internally, when a new configuration is encountered, old nodes are discarded and prototype nodes are populated with the new values. This is the mechanism I reimplemented for DictConfig nodes behind a special flag, and it enables easier ad-hoc replacement of config parts without the need for a defaults list. I really dislike that it relies on the internal API, though, so maybe @omry should consider a mechanism for marking nodes to be overridden instead of merged, as it is a frequently requested feature :)

@Jasha10
Collaborator

Jasha10 commented Aug 25, 2022

@Penchekrak if I understand correctly, your patch makes OmegaConf.merge behave differently depending on whether the "overridable" flag is set on the DictConfig instance.

My opinion is that this issue should be resolved at the level of Hydra rather than by adding a feature to OmegaConf, though your idea of customizing merge through use of flags is interesting.

@SZiesche

I have a very similar problem, and I don't know if it is a bug or if I am doing something wrong. It only appears when the second layer of nesting is introduced in the following minimal example.

I tried to debug this, and I see that the statement in the error message is true. But I don't get why this should be a problem at all.

from dataclasses import dataclass, field
from typing import Any, List

import hydra.utils
from hydra.core.config_store import ConfigStore
from omegaconf import MISSING

# I have multiple datasets


@dataclass
class DataConfig:
    """This is just a common base class."""


@dataclass
class Dataset1Config(DataConfig):
    some_member1: int = 1


@dataclass
class Dataset2Config(DataConfig):
    some_member2: int = 2


# I register them at some place in my folder structure.
cs = ConfigStore.instance()
cs.store(group="some/data/folder", name=Dataset1Config.__name__, node=Dataset1Config)
cs.store(group="some/data/folder", name=Dataset2Config.__name__, node=Dataset2Config)


# I have multiple training routines (listed only one here deriving from a common base class)


@dataclass
class TrainingConfig:
    pass


@dataclass
class SpecialTrainingConfig(TrainingConfig):
    some_member4: int = 4


cs.store(group="some/training/folder", name=SpecialTrainingConfig.__name__, node=SpecialTrainingConfig)


# finally I have multiple models that can be trained differently. Model1 is usually trained by the special training, so
# I want to have this as a default.


@dataclass
class ModelConfig:
    """This is just a common base class."""


@dataclass
class Model1Config(ModelConfig):
    defaults: List[Any] = field(default_factory=lambda: [{"/some/training/folder@training": "SpecialTrainingConfig"}])

    training: TrainingConfig = MISSING


@dataclass
class Model2Config(ModelConfig):
    some_member: int = 3


# I register them at some place in my folder structure.
cs.store(group="some/model/folder", name=Model1Config.__name__, node=Model1Config)
cs.store(group="some/model/folder", name=Model2Config.__name__, node=Model2Config)


# main.py:
# for the main routine of my machine learning project, I also have a config.
# usually I would use dataset1, so I have this as a default. But the model changes often and I don't want to use a
# default there.
@dataclass
class ScriptConfig:
    defaults: List[Any] = field(default_factory=lambda: [{"some/data/folder@dataset": "Dataset1Config"}])

    dataset: DataConfig = MISSING
    model: ModelConfig = MISSING


cs.store(name="ScriptConfig", node=ScriptConfig)


@hydra.main(config_name="my_config2", version_base="1.2", config_path=".")
def main(cfg):
    print(cfg)
    # some more code follows ...


if __name__ == "__main__":
    main()

my_config2.yaml now looks like this:

defaults:
  - ScriptConfig
  - /some/model/folder@model: Model1Config

so I want to train model1. Running this, I end up with the error msg: In 'some/training/folder/SpecialTrainingConfig': ConfigKeyError raised while composing config: Key 'training' not in 'ModelConfig'

Is this expected? If yes, what would have to be done to change this?

@alexcoca

@Penchekrak if I understand correctly, your patch makes OmegaConf.merge behave differently depending on whether the "overridable" flag is set on the DictConfig instance.

My opinion is that this issue should be resolved at the level of Hydra rather than by adding a feature to OmegaConf, though your idea of customizing merge through use of flags is interesting.

Has this gained any traction? :) Interesting discussion!

@ga92xug

ga92xug commented Jan 16, 2024

Is there a way to tell Hydra from an experiment file (https://hydra.cc/docs/patterns/configuring_experiments/) to ignore all previous entries? I assume that we run into a similar problem with merge but maybe there is a keyword that I have not found like delete or disregard all other entries to scheduler.

defaults:
  - override /training_setup/scheduler: cosine_annealing_lr

@odelalleau
Collaborator

Is there a way to tell Hydra from an experiment file (https://hydra.cc/docs/patterns/configuring_experiments/) to ignore all previous entries?

Not sure exactly what you mean by "ignore all previous entries". If it is the same problem as the one that originally motivated this issue, then AFAIK there is still no really nice way to do it (but I agree it'd be a useful feature). If this is something else, maybe it'd be best to start a new issue or discussion?
