
ENH: Add config / params dataclasses for high-level functions #679

Open
Tracked by #614
NickleDave opened this issue Jul 3, 2023 · 1 comment
Labels
ENH: enhancement enhancement; new feature or request

Comments

@NickleDave
Collaborator

High-level functions--prep, train, eval, predict, learncurve--now dispatch to lower-level functions based on the task and/or dataset.

Currently, every parameter of the lower-level functions must also be a parameter of the higher-level function, which then passes them through as arguments.

This results in a combinatorial explosion of parameters for the high-level function and makes it hard to know which parameters belong to which lower-level function.

We should instead have the high-level function accept a params argument that can be one of a set of dataclasses.

E.g., vak.train would accept a train_params argument that can be vak.train.frame_classification.TrainParams or vak.train.dimensionality_reduction.TrainParams.

This will also make it easier to map from a config file to params classes if we use similar levels in the config file, e.g.

[vak.train.frame_classification]
pretrained_weights_path = '/dev/null/multiverse'
@NickleDave NickleDave changed the title ENG: Add Params dataclasses for high-level functions ENH: Add Params dataclasses for high-level functions Jul 18, 2023
@NickleDave NickleDave added the ENH: enhancement enhancement; new feature or request label Jul 18, 2023
@NickleDave NickleDave changed the title ENH: Add Params dataclasses for high-level functions ENH: Add config / params dataclasses for high-level functions Jan 22, 2024
@NickleDave
Collaborator Author

NickleDave commented Jan 22, 2024

I have thought about a couple of ways to do this:

  1. Use introspection to make a dataclass directly from the training function and its type annotations -- this is the Params approach, since we make a dataclass whose attributes are literally the parameters of the function.
  2. Write a high-level TrainConfig class and then subclass it. This potentially has all the usual issues with subclassing, with the caveat that these are just attributes, not methods proper, so we seem less likely to run into those issues. Worst case, we might just end up adding/removing attributes a lot.
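Approach 1 can be sketched with inspect.signature plus dataclasses.make_dataclass. The training function here is a hypothetical stand-in, not vak's real one:

```python
# Hypothetical sketch of approach 1: build a Params dataclass from a
# function's signature via introspection.
import inspect
from dataclasses import make_dataclass, field


def train_frame_classification_model(model_name: str, batch_size: int = 64, num_epochs: int = 2):
    """Stand-in for a lower-level training function."""


def params_class_from(func):
    """Make a dataclass whose attributes are the parameters of `func`."""
    fields_ = []
    for name, param in inspect.signature(func).parameters.items():
        # fall back to `object` when a parameter has no type annotation
        annotation = param.annotation if param.annotation is not inspect.Parameter.empty else object
        if param.default is inspect.Parameter.empty:
            fields_.append((name, annotation))
        else:
            fields_.append((name, annotation, field(default=param.default)))
    class_name = func.__name__.title().replace("_", "") + "Params"
    return make_dataclass(class_name, fields_)


TrainParams = params_class_from(train_frame_classification_model)
params = TrainParams(model_name="tweetynet")
print(params.batch_size)  # 64
```

One caveat of this approach: the generated class is purely structural, so docstrings and per-field validation would still have to be attached separately.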

Either way we should end up with high-level functions that just take a TrainConfig/TrainParams, something like this:

def train(
    config: TrainFrameClassificationConfig | TrainParametricUMAPConfig | TrainAvaConfig
):
    # validate config is an instance of one of the configs we type hint with, then
    train_kwargs = asdict(config)
    if isinstance(config, TrainFrameClassificationConfig):
        train_frame_classification_model(**train_kwargs)
    elif ...
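Filled out as a self-contained, runnable sketch (the config classes and lower-level functions here are hypothetical stand-ins for vak's real ones):

```python
# Runnable sketch of the dispatch-on-config-type idea; all names are
# illustrative assumptions, not vak's actual API.
from dataclasses import dataclass, asdict


@dataclass
class TrainFrameClassificationConfig:
    model_name: str
    num_epochs: int = 2


@dataclass
class TrainParametricUMAPConfig:
    model_name: str
    n_neighbors: int = 15


def train_frame_classification_model(model_name, num_epochs):
    return f"frame classification: {model_name}, {num_epochs} epochs"


def train_parametric_umap_model(model_name, n_neighbors):
    return f"parametric UMAP: {model_name}, {n_neighbors} neighbors"


def train(config):
    # dispatch to a lower-level function based on the config's type
    train_kwargs = asdict(config)
    if isinstance(config, TrainFrameClassificationConfig):
        return train_frame_classification_model(**train_kwargs)
    elif isinstance(config, TrainParametricUMAPConfig):
        return train_parametric_umap_model(**train_kwargs)
    raise TypeError(f"unrecognized config type: {type(config).__name__}")


print(train(TrainFrameClassificationConfig(model_name="tweetynet")))
```

Raising TypeError for an unrecognized config type gives the validation step mentioned in the comment above a concrete form.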
