
Loading Best Model from File #41

Closed · SivamPillai opened this issue Jul 30, 2019 · 10 comments

@SivamPillai
The documentation clearly explains the procedure for loading the best model after hyperparameter optimization is complete:

models = tuner.get_best_models(num_models=2)

The metrics/predictions can also be obtained with:

best_model = models[0]

# Evaluate the best model.
loss, accuracy = best_model.evaluate(x_val, y_val)

However, how do you load a previously tuned model from file, and how do you get the best model to make predictions?

@SivamPillai (Author)

My bad, I should have gone through the source code. For those who stumble upon this, the way I found is to use the reload function. However, this still requires initializing the tuner exactly as it was initialized during the tuning process.

So for example:

Tuning:

from kerastuner.tuners import RandomSearch

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    executions_per_trial=3,
    directory='my_dir',
    project_name='helloworld')

tuner.search(x, y,
             epochs=5,
             validation_data=(val_x, val_y))

Loading the best model in a new script:

from kerastuner.tuners import RandomSearch

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    executions_per_trial=3,
    directory='my_dir',
    project_name='helloworld')

tuner.reload()
tuner.get_best_models(num_models=2)
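
To answer the second half of the original question (making predictions), you can then take the first of the returned models and call predict on it. A minimal sketch, assuming x_test is a held-out input array shaped like the training data:

# x_test is assumed to be a held-out input array shaped like the training data.
best_models = tuner.get_best_models(num_models=2)
predictions = best_models[0].predict(x_test)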

Correct me if there are other/better ways of doing this!

@ppurwar commented Aug 2, 2019

I also did the same to access a completed search, but there should be another way.
This forces you to have access to the code that builds the model and to reinstantiate the tuner every time.

@ppurwar commented Aug 2, 2019

I was able to come up with a workaround. Use the code below to import the results in another script:

import os
import json
import kerastuner.engine.hyperparameters as hp_module
import kerastuner.engine.trial as trial_module
import kerastuner.engine.metrics_tracking as metrics_tracking
from kerastuner.abstractions.tensorflow import TENSORFLOW_UTILS as tf_utils
import tensorflow as tf
from tensorflow.keras.models import model_from_json

class SearchResults(object):
    def __init__(self, directory, project_name, objective):
        self.directory = directory
        self.project_name = project_name
        self.objective = objective
        
    def reload(self):
        """Populate `self.trials` and `self.oracle` state."""
        fname = os.path.join(self.directory, self.project_name, 'tuner.json')
        state_data = tf_utils.read_file(fname)
        state = json.loads(state_data)

        self.hyperparameters = hp_module.HyperParameters.from_config(
            state['hyperparameters'])
        self.best_metrics = metrics_tracking.MetricsTracker.from_config(
            state['best_metrics'])
        self.trials = [trial_module.Trial.load(f) for f in state['trials']]
        self.start_time = state['start_time']
    
    def _get_best_trials(self, num_trials=1):
        if not self.best_metrics.exists(self.objective):
            return []
        trials = []
        for x in self.trials:
            if x.score is not None:
                trials.append(x)
        if not trials:
            return []
        direction = self.best_metrics.directions[self.objective]
        sorted_trials = sorted(trials,
                               key=lambda x: x.score,
                               reverse=direction == 'max')
        return sorted_trials[:num_trials]
    
    def get_best_models(self, num_models=1):
        best_trials = self._get_best_trials(num_models)
        models = []
        for trial in best_trials:
            hp = trial.hyperparameters.copy()
            # Get best execution.
            direction = self.best_metrics.directions[self.objective]
            executions = sorted(
                trial.executions,
                key=lambda x: x.per_epoch_metrics.get_best_value(
                    self.objective),
                reverse=direction == 'max')
            
            # Reload best checkpoint.
            ckpt = executions[0].best_checkpoint
            model_graph = ckpt + '-config.json'
            model_wts = ckpt + '-weights.h5'
            with open(model_graph, 'r') as f:
                model = model_from_json(f.read())
            model.load_weights(model_wts)
            models.append(model)
        return models


Example usage:

    res = SearchResults(directory='./multiclass_classifier/training',
                        project_name='search_bs2000',
                        objective='val_accuracy')
    res.reload()
    model = res.get_best_models()[0]

@SivamPillai (Author)

That is great. Looks much more elegant. Will check it out!

@franchesoni
That works and is indeed useful. Thank you!

@omalleyt12 (Contributor)

Generally, what I'd advise for inference is to grab and persist the best hyperparameters from the Oracle, then pass these hyperparameters to your build_model function and train the Model from scratch on the full dataset. This, for instance, is how AutoKeras makes use of this repo.

For example:

hps = tuner.oracle.get_best_trials(num_trials=1)[0].hyperparameters
model = build_model(hps)
model.fit(...)
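
If you want to persist those hyperparameters for a separate inference or retraining script, one option (a sketch, not necessarily the intended API; it leans on HyperParameters.get_config(), the counterpart of the from_config() call used in the workaround above) is to round-trip them through JSON:

import json
import kerastuner.engine.hyperparameters as hp_module

# Save the best hyperparameters found by the Oracle (sketch; `hps` as above).
with open('best_hps.json', 'w') as f:
    json.dump(hps.get_config(), f)

# In a later script: reload them and rebuild the model without re-running the search.
with open('best_hps.json') as f:
    hps = hp_module.HyperParameters.from_config(json.load(f))
model = build_model(hps)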

@schmidt-jake
Is this the desired behavior? Why can't tuner.get_best_models() simply return trained models to be used in inference/serving?

@omalleyt12 (Contributor)

@JakeTheWise it can and does

But usually when doing hyperparameter tuning, you'll split the data into three sets: train, validation, and test

You'll perform the hyperparameter search using the train set to train the model, and the validation set to evaluate hyperparameter performance

Then you evaluate the generalization ability on the test set with either:

  1. The best model found during the hyperparameter search as-is (as you suggested)
  2. Retraining using the train + validation data with the hyperparameters found during the search

get_best_models does (1)

Since more data is almost always better, (2) is likely to give you better performance on the test set (and in production), but requires additional training time (a sketch of this retraining step follows below)

The idea of get_best_models is just to be a convenient way to access the models that were trained during the search
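
For concreteness, a minimal sketch of option (2), assuming build_model is the same function passed to the tuner and that x_train/y_train, x_val/y_val, and x_test/y_test are your own splits:

import numpy as np

# Sketch of option (2): retrain from scratch on train + validation data with the
# best hyperparameters found during the search. x_train/y_train, x_val/y_val,
# x_test/y_test, and build_model are assumed to be defined as in your tuning script.
best_hps = tuner.oracle.get_best_trials(num_trials=1)[0].hyperparameters

x_full = np.concatenate([x_train, x_val])
y_full = np.concatenate([y_train, y_val])

model = build_model(best_hps)
model.fit(x_full, y_full, epochs=5)

# Evaluate generalization on the untouched test set.
loss, accuracy = model.evaluate(x_test, y_test)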

@schmidt-jake
Understood!

@nithishsriramoju
(Quoting @ppurwar's SearchResults workaround above.)

I need some help extracting the best model before hyperparameter optimization has completed (using the available trial files in the project directory).
