add ml_model_settings parameter #434

Merged: 5 commits into master from feature/MLModel_settings on May 28, 2024

Conversation

@mafrahm mafrahm (Contributor) commented May 21, 2024

This PR adds the 'ml_model_settings' parameter that writes key-value pairs directly into the self.parameters attribute.

Example:

import law

from columnflow.ml import MLModel


class MyModel(MLModel):
    def __init__(
        self,
        *args,
        **kwargs,
    ):
        # initialize the MLModel base class (fills self.parameters, e.g. from --ml-model-settings)
        super().__init__(*args, **kwargs)

        # we cannot cast to dict on the command line, but one can do this by hand
        ml_process_weights = self.parameters.get("ml_process_weights", {"st": 1, "tt": 2})
        if isinstance(ml_process_weights, tuple):
            ml_process_weights = {proc: int(weight) for proc, weight in [s.split(":") for s in ml_process_weights]}

        # store parameters of interest in the ml_model_inst, e.g. via the parameters attribute
        self.parameters = {
            "batchsize": int(self.parameters.get("batchsize", 1024)),
            "layers": tuple(int(layer) for layer in self.parameters.get("layers", (64, 64, 64))),
            "ml_process_weights": ml_process_weights,
        }

        # create representation of ml_model_inst
        self.parameters_repr = law.util.create_hash(sorted(self.parameters.items()))

These parameters can then be changed on the command line:

law run cf.MLTraining --version v1 --ml-model MyModel --ml-model-settings "batchsize=2048,layers=32;32;32,ml_process_weights=st:1;tt:4"
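
For illustration only, a settings string like the one above could be split into key-value pairs as sketched below; the helper name and the exact splitting rules are assumptions for this example and not columnflow's actual parsing code:

def parse_ml_model_settings(settings: str) -> dict:
    # hypothetical helper (not part of columnflow): split "a=1,b=2;3" into
    # {"a": "1", "b": ("2", "3")}; values containing ";" become tuples of strings
    params = {}
    for item in settings.split(","):
        key, _, value = item.partition("=")
        params[key] = tuple(value.split(";")) if ";" in value else value
    return params

parse_ml_model_settings("batchsize=2048,layers=32;32;32,ml_process_weights=st:1;tt:4")
# -> {"batchsize": "2048", "layers": ("32", "32", "32"), "ml_process_weights": ("st:1", "tt:4")}

In this sketch all values arrive as strings (or tuples of strings), which mirrors why MyModel.__init__ above casts them to int/tuple/dict itself.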

Open questions:

1.) At the moment, ml_model_settings is not implemented for the MLModelsMixin, so we can only use the "default" model (or create a new model via derive) when e.g. creating histograms. We could add an ml_models_settings parameter to this mixin, but I'm not sure whether it would be used much.
2.) Since the output of the model is now hashed, we might also want to automatically produce an output of the parameters or the parameters_repr so that ML trainings can always be reproduced if necessary (a rough sketch of doing this by hand is shown below).
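
As a purely illustrative sketch (the helper name, file format, and call site are assumptions, not an existing columnflow mechanism), such a dump could be done by hand at the end of the training:

import json

def dump_parameters(ml_model_inst, path: str) -> None:
    # illustrative helper: persist the resolved parameters and their hash so that
    # a training can later be reproduced from the written file
    with open(path, "w") as f:
        json.dump(
            {
                "parameters_repr": ml_model_inst.parameters_repr,
                "parameters": ml_model_inst.parameters,
            },
            f,
            indent=4,
            default=str,  # fall back to str() for values json cannot serialize
        )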

@mafrahm mafrahm added the enhancement (New feature or request) and ml (ML pipeline related things) labels May 21, 2024
@mafrahm mafrahm self-assigned this May 21, 2024
@pkausw pkausw (Member) left a comment

Nice PR, thanks! Just some minor questions.

I agree that having a mechanism that automatically dumps the current parameter settings would be nice. However, I think the user can do this themselves at the end of e.g. the training loop. We might want to think about providing e.g. a decorator that can be used directly for this purpose and that wraps e.g. the training function of the model inst. But imho, this isn't critical for this PR. (Though it also shouldn't be super hard, so if you have some spare time, feel free to add it ;) )
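
A rough sketch of such a decorator, assuming a train(self, task, input, output) signature where output is a directory-like law target; the decorator name and these assumptions are illustrative, not an existing columnflow API:

import functools

def dumps_parameters(train_func):
    # hypothetical decorator: write the model parameters next to the training
    # outputs before running the wrapped train method
    @functools.wraps(train_func)
    def wrapper(self, task, input, output):
        # assumes "output" is a directory target providing child() and dump()
        output.child("parameters.json", type="f").dump(
            {"parameters_repr": self.parameters_repr, "parameters": self.parameters},
            formatter="json",
        )
        return train_func(self, task, input, output)
    return wrapper

A model could then opt in by decorating its train method with @dumps_parameters.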

Resolved review threads: columnflow/ml/__init__.py, columnflow/tasks/framework/mixins.py
@pkausw pkausw (Member) left a comment

LGTM, thanks!

Resolved review threads: columnflow/tasks/framework/mixins.py, columnflow/ml/__init__.py
@pkausw pkausw merged commit 3253c97 into master May 28, 2024
8 checks passed
@pkausw pkausw deleted the feature/MLModel_settings branch May 28, 2024 16:55