Initial implementation of the end-to-end autotrain module #1219
Changes from 41 commits
New file: `automl.py` (+111 lines)

```python
"""
automl.py

Driver script which:

(1) Builds a base config by performing type inference and populating the
    config with default combiner parameters, training parameters, and a
    hyperopt search space
(2) Tunes the config based on resource constraints
(3) Runs a hyperparameter optimization experiment
"""
import logging
import sys
from typing import Dict, Union

import numpy as np
import pandas as pd

from ludwig.automl.base_config import create_default_config
from ludwig.hyperopt.run import hyperopt

logger = logging.getLogger(__name__)


try:
    import dask.dataframe as dd
    import ray
except ImportError:
    logger.error(
        'ray is not installed. '
        'In order to use auto_train, please run '
        'pip install ludwig[ray]'
    )
    sys.exit(-1)

OUTPUT_DIR = "."


def model_select(default_configs):
    """
    Performs model selection based on the dataset.

    Note: the current implementation returns tabnet by default. This will
    be improved in subsequent iterations.
    """
    return default_configs['tabnet']
```
```python
def auto_train(
    dataset: Union[str, pd.DataFrame, dd.core.DataFrame],
    target: str,
    time_limit_s: Union[int, float],
    output_dir: str = OUTPUT_DIR,
    config=None,
):
    """
    Main auto_train API. First builds configs for each model type
    (e.g. concat, tabnet, transformer), then selects a model based on
    dataset attributes, and finally runs a hyperparameter optimization
    experiment.

    All batch size and learning rate tuning is done at training time.

    # Inputs
    :param dataset: (str, pd.DataFrame, dd.core.DataFrame) path to the
        dataset or an in-memory dataframe
    :param target: (str) name of the target feature
    :param time_limit_s: (int, float) total time allocated to auto_train;
        acts as the stopping parameter
    :param output_dir: (str) directory where experiment output is written
    :param config: (dict, optional) pre-built config; if None, one is
        created automatically

    # Returns
    :return: (dict) path to the best trained model and the id of the
        best trial
    """
    if config is None:
        config = _create_auto_config(dataset, target, time_limit_s)
    model_name = config['combiner']['type']
```
Review comment: `COMBINER` and `TYPE` should be constants.
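The suggestion above can be sketched as follows; `COMBINER` and `TYPE` are hypothetical constant names chosen here for illustration, not necessarily the names used in Ludwig's constants module:

```python
# Hedged sketch of the review suggestion: hoist the repeated dict keys
# into module-level constants instead of scattering string literals.
# The constant names are assumptions.
COMBINER = 'combiner'
TYPE = 'type'

# minimal example config, mirroring the shape used in auto_train
config = {COMBINER: {TYPE: 'tabnet'}}
model_name = config[COMBINER][TYPE]
```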
```python
    hyperopt_results = _train(config, dataset,
                              output_dir, model_name=model_name)
    experiment_analysis = hyperopt_results.experiment_analysis
    # catch the edge case where metric_score is NaN
    # TODO (ASN): decide how to proceed if at least one trial has completed
    for trial in hyperopt_results.ordered_trials:
        if np.isnan(trial.metric_score):
            raise ValueError(
                "There was an error running the experiment. "
                "A trial failed to start. "
                "Consider increasing the time budget for the experiment."
            )

    autotrain_results = {
        'path_to_best_model': experiment_analysis.best_checkpoint,
        'trial_id': "_".join(
            experiment_analysis.best_logdir.split("/")[-1].split("_")[1:])
    }
    return autotrain_results
```

Review thread on the `ValueError` message (lines +78 to +80):

- w4nderlust: Are we sure failing to start is the only possible reason for a NaN?
- Reply: @w4nderlust Not sure - let me investigate
- w4nderlust: Another way around it is to just check if the …
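One way to act on the TODO above (proceed as long as at least one trial completed, rather than failing on the first NaN) could look like the following sketch. `usable_trials` is a hypothetical helper, not part of the PR:

```python
import math


def usable_trials(ordered_trials):
    """Keep trials with a real metric score; fail only if none completed.

    Sketch of the alternative raised in review: filter out NaN-scored
    trials instead of raising on the first one encountered.
    """
    completed = [t for t in ordered_trials
                 if not math.isnan(t.metric_score)]
    if not completed:
        raise ValueError(
            "There was an error running the experiment. "
            "No trial produced a metric score. "
            "Consider increasing the time budget for the experiment."
        )
    return completed
```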
```python
def _create_auto_config(dataset, target, time_limit_s) -> dict:
    default_configs = create_default_config(dataset, target, time_limit_s)
    model_config = model_select(default_configs)
    return model_config
```

Review thread on `_create_auto_config`:

- tgaddair: Let's make this public by removing the underscore. But …
- Reply: @tgaddair Totally agree with making …
- tgaddair: The idea would be if the user wants to inspect the auto config and modify it before training, e.g.: … Does that seem reasonable to you?
- Reply: Right, right. This makes total sense!
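The workflow tgaddair describes (inspect and tweak the auto-generated config before training) might look roughly like the sketch below. The config contents and the `dropout` tweak are illustrative assumptions, as is the proposed public name `create_auto_config`:

```python
# Stand-in for what a public create_auto_config(...) might return;
# the real config would come from type inference over the dataset.
config = {'combiner': {'type': 'tabnet'}}

# The user inspects and edits the config before training.
config['combiner']['dropout'] = 0.1  # hypothetical parameter tweak

# Then hands the edited config back, e.g.:
#   auto_train(dataset, target, time_limit_s, config=config)
```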
```python
def _train(
    config: Dict,
    dataset: Union[str, pd.DataFrame, dd.core.DataFrame],
    output_dir: str,
    model_name: str
):
    hyperopt_results = hyperopt(
        config,
        dataset=dataset,
        output_directory=output_dir,
        model_name=model_name
    )
    return hyperopt_results
```
Review thread on the `sys.exit(-1)` in the import guard:

- Reviewer: I know we do this in a few other places in Ludwig, but for programmatic usage we should probably avoid calling `sys.exit` in case the user doesn't want their notebook to crash. Maybe raise an exception?
- Reply: Good point
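The alternative raised in this thread could be sketched as a guard that raises instead of exiting; `require_ray` is a hypothetical helper name, and the message mirrors the one logged in the module above:

```python
def require_ray():
    """Raise ImportError instead of calling sys.exit(-1), so a notebook
    or other programmatic caller can catch the failure and recover."""
    try:
        import ray  # noqa: F401
    except ImportError as e:
        raise ImportError(
            'ray is not installed. '
            'In order to use auto_train, please run '
            'pip install ludwig[ray]'
        ) from e
```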