Add Trial Validation Split In AutoML Loop #295

Merged
merged 33 commits into from Mar 30, 2020

alexbrillant
Contributor

@alexbrillant alexbrillant commented Mar 27, 2020

What it is

My pull request does: Adds a trial split (TrialSplit) to the Trial data container, so that each trial stores the results of every validation split.

How it works

I coded it this way:

class Trial:
    """
    Trial data container for :class:`AutoML`.
    A Trial contains the results for each validation split.
    Each trial split contains both the training set results, and the validation set results.
    .. seealso::
        :class:`AutoML`,
        :class:`TrialSplit`,
        :class:`HyperparamsRepository`,
        :class:`BaseHyperparameterSelectionStrategy`,
        :class:`RandomSearchHyperparameterSelectionStrategy`,
        :class:`DataContainer`
    """

    def __init__(
            self,
            hyperparams: HyperparameterSamples,
            status: 'TRIAL_STATUS' = None,
            pipeline: BaseStep = None,
            validation_splits=None,
            cache_folder=None,
            error=None,
            error_traceback=None,
            start_time=None,
            end_time=None
    ):
        if status is None:
            status = TRIAL_STATUS.PLANNED
        if validation_splits is None:
            validation_splits = []

        self.status: TRIAL_STATUS = status
        self.hyperparams = hyperparams
        self.pipeline: BaseStep = pipeline
        self.validation_splits = validation_splits
        self.cache_folder = cache_folder
        self.error_traceback = error_traceback
        self.error = error
        self.start_time = start_time
        self.end_time = end_time
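
To illustrate the idea of a trial that holds one result per validation split, here is a minimal, self-contained sketch. It is not the Neuraxle implementation: `SketchTrial`, `SketchTrialSplit`, `TrialStatus`, `new_split` and `mean_validation_score` are hypothetical names, and averaging the split scores is an assumption made only for the example.

```python
from enum import Enum
from typing import Any, Dict, List, Optional


class TrialStatus(Enum):
    # Hypothetical stand-in for the TRIAL_STATUS enum used in the PR.
    PLANNED = "planned"
    SUCCESS = "success"
    FAILED = "failed"


class SketchTrialSplit:
    """Results for one validation split (hypothetical, simplified)."""

    def __init__(self, split_number: int):
        self.split_number = split_number
        self.status = TrialStatus.PLANNED
        self.validation_score: Optional[float] = None


class SketchTrial:
    """Holds one set of hyperparameters and one result per validation split."""

    def __init__(self, hyperparams: Dict[str, Any],
                 validation_splits: Optional[List[SketchTrialSplit]] = None):
        # Default to None and build the list here, as the PR does, so that
        # two trials never share the same mutable default list.
        if validation_splits is None:
            validation_splits = []
        self.hyperparams = hyperparams
        self.validation_splits = validation_splits

    def new_split(self) -> SketchTrialSplit:
        # Splits are numbered in the order they are created.
        split = SketchTrialSplit(split_number=len(self.validation_splits))
        self.validation_splits.append(split)
        return split

    def mean_validation_score(self) -> float:
        # Assumption for this sketch: a trial's score is the mean over
        # its successful splits.
        scores = [s.validation_score for s in self.validation_splits
                  if s.status is TrialStatus.SUCCESS]
        return sum(scores) / len(scores)


trial_a = SketchTrial({"learning_rate": 0.1})
trial_b = SketchTrial({"learning_rate": 0.01})
for score in (0.8, 0.9):
    split = trial_a.new_split()
    split.validation_score = score
    split.status = TrialStatus.SUCCESS
```

Here `trial_a` ends up with two splits while `trial_b` still has none, which is the behavior the `None` default in the constructor is meant to guarantee.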

Example usage

Here is how it is used in the AutoML loop class:

        for trial_number in range(self.n_trial):
            self.print_func('\ntrial {}/{}'.format(trial_number + 1, self.n_trial))
            auto_ml_data = self._load_auto_ml_data(trial_number)

            with self.hyperparams_repository.new_trial(auto_ml_data) as repo_trial:

                for training_data_container, validation_data_container in validation_splits:
                    p = copy.deepcopy(self.pipeline)
                    p.update_hyperparams(repo_trial.hyperparams)
                    repo_trial.set_hyperparams(p.get_hyperparams())

                    with repo_trial.new_validation_split(p) as repo_trial_split:
                        try:
                            repo_trial_split = self.trainer.fit(
                                trial=repo_trial_split,
                                train_data_container=training_data_container,
                                validation_data_container=validation_data_container,
                                context=context
                            )

                            repo_trial_split.set_success()

                            self.print_func('trial {}/{} split {}/{} score: {}'.format(
                                trial_number + 1,
                                self.n_trial,
                                repo_trial_split.split_number + 1,
                                len(validation_splits),  # total number of splits, assuming validation_splits is a sized sequence
                                repo_trial_split.get_validation_score()
                            ))

                        except Exception as error:
                            track = traceback.format_exc()
                            self.print_func(track)
                            repo_trial_split.set_failed(error)

                    repo_trial.update_final_trial_status()

            self.hyperparams_repository.save_trial(repo_trial)

        best_hyperparams = self.hyperparams_repository.get_best_hyperparams(
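
The loop above wraps each split in a `with` block and records success or failure per split instead of aborting the whole trial. A minimal sketch of that pattern, under stated assumptions, could look like the following. `SketchSplit` and `run_splits` are hypothetical names, and unlike the PR's trainer, the `fit` callables here simply return a score.

```python
import traceback
from typing import Callable, List, Optional


class SketchSplit:
    """Hypothetical stand-in for a trial split used as a context manager."""

    def __init__(self, split_number: int):
        self.split_number = split_number
        self.status = "planned"
        self.error: Optional[BaseException] = None
        self.score: Optional[float] = None

    def __enter__(self) -> "SketchSplit":
        self.status = "started"
        return self

    def __exit__(self, exc_type, exc_value, tb) -> bool:
        # Returning False lets any exception not handled inside the
        # block propagate to the caller.
        return False

    def set_success(self) -> None:
        self.status = "success"

    def set_failed(self, error: BaseException) -> None:
        self.status = "failed"
        self.error = error


def run_splits(fit_functions: List[Callable[[], float]]) -> List[SketchSplit]:
    results = []
    for split_number, fit in enumerate(fit_functions):
        with SketchSplit(split_number) as split:
            try:
                split.score = fit()
                split.set_success()
            except Exception as error:
                # Print the traceback, then mark only this split as failed.
                print(traceback.format_exc())
                split.set_failed(error)
        results.append(split)
    return results


def failing_fit() -> float:
    raise ValueError("bad hyperparameters")


results = run_splits([lambda: 0.9, failing_fit, lambda: 0.7])
```

The second split fails, but the third one still runs, so one bad split does not prevent the remaining splits from being evaluated.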

@cla-bot cla-bot bot added the cla-signed label Mar 27, 2020
@alexbrillant alexbrillant changed the title Add Trial Repo Split Add Trial Validation Split In AutoML Loop Mar 27, 2020

time_a = time.time()
auto_ml = AutoML(
    pipeline,
-   refit_trial=False,
+   refit_trial=True,

I've already made this comment in the past, but it'd be good to provide an example comment showing the time printed, so that readers can see how much faster it is. Optional fix.
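
As a rough sketch of the kind of timing snippet this comment refers to, the elapsed time could be measured and printed as below. The `timed` helper is hypothetical, and `time.perf_counter()` is used here instead of the example's `time.time()`. No expected duration is shown, since it depends on the machine.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds). Hypothetical helper."""
    time_a = time.perf_counter()
    result = func(*args, **kwargs)
    time_b = time.perf_counter()
    return result, time_b - time_a


# sum(range(1000)) stands in for the AutoML fit call of the example.
result, elapsed = timed(sum, range(1000))
print('{:.4f} seconds'.format(elapsed))
```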

        BaseValidationSplitter.__init__(self)
        self.k_fold = k_fold

    def split(self, data_inputs, expected_outputs=None) -> Tuple[List, List, List, List]:
Member

Maybe def __call__ instead? This would allow for passing functions as well as splitters. Idk.

Contributor Author

I have tried to do this, and failed. I think it's fine like this tbh
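
For context, here is a minimal sketch of what the `__call__` suggestion would make possible: code that accepts either a splitter object or a plain function. It is not what was merged. `SketchHoldoutSplitter`, `half_split` and `count_validation_items` are hypothetical, and the order of the four returned lists (training inputs, training outputs, validation inputs, validation outputs) is an assumption.

```python
from typing import Callable, List, Tuple

Split = Tuple[List, List, List, List]


class SketchHoldoutSplitter:
    """Hypothetical splitter: the last `test_size` fraction is for validation."""

    def __init__(self, test_size: float):
        self.test_size = test_size

    def split(self, data_inputs, expected_outputs=None) -> Split:
        if expected_outputs is None:
            expected_outputs = [None] * len(data_inputs)
        n_train = int(len(data_inputs) * (1 - self.test_size))
        return (list(data_inputs[:n_train]), list(expected_outputs[:n_train]),
                list(data_inputs[n_train:]), list(expected_outputs[n_train:]))

    def __call__(self, data_inputs, expected_outputs=None) -> Split:
        # Delegating to split() makes the instance usable like a function.
        return self.split(data_inputs, expected_outputs)


def half_split(data_inputs, expected_outputs=None) -> Split:
    """A plain function with the same call signature as the splitter."""
    if expected_outputs is None:
        expected_outputs = [None] * len(data_inputs)
    middle = len(data_inputs) // 2
    return (list(data_inputs[:middle]), list(expected_outputs[:middle]),
            list(data_inputs[middle:]), list(expected_outputs[middle:]))


def count_validation_items(splitter: Callable[..., Split],
                           data_inputs, expected_outputs) -> int:
    # The caller does not need to know whether it got an object or a function.
    _, _, validation_inputs, _ = splitter(data_inputs, expected_outputs)
    return len(validation_inputs)


data = list(range(8))
labels = [x * 2 for x in data]
n_holdout = count_validation_items(SketchHoldoutSplitter(0.25), data, labels)
n_half = count_validation_items(half_split, data, labels)
```

The trade-off is that an explicit `split` method is easier to discover and document, which may be one reason to keep it as is.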

@alexbrillant alexbrillant merged commit 46b7f97 into Neuraxio:master Mar 30, 2020
@alexbrillant alexbrillant deleted the trial-repo-split branch March 30, 2020 01:19
alexbrillant added a commit to alexbrillant/Neuraxle that referenced this pull request Sep 3, 2020
Add Trial Validation Split In AutoML Loop