Implement Early Stopping for AutoML #241

Merged: 51 commits, Dec 12, 2019
Changes from 24 commits
Commits (all by jeremyliweishih):

8fcdee0  First pass without default value (Nov 25, 2019)
41c5378  CL (Nov 25, 2019)
f1520f9  lint (Nov 25, 2019)
4170ea9  use pd mean (Nov 25, 2019)
06f97ad  move check for early stopping (Nov 25, 2019)
49cdb52  one call to mean (Nov 25, 2019)
ce35c10  lint (Nov 25, 2019)
131c0e5  seperate out (Nov 27, 2019)
ecb7154  Make tests more consistent (Dec 2, 2019)
3aa4a6d  Cleanup (Dec 2, 2019)
e00406e  Add max_time test (Dec 2, 2019)
ba3e3bd  Fix TQDM popping up twice (Dec 2, 2019)
5998ad4  remove raise errors (Dec 2, 2019)
1973d66  up time so circleci doesn't fail (Dec 2, 2019)
6c9d3db  default to none for now and add check (Dec 2, 2019)
3e6663e  Remove max_time for autoclassifier (Dec 2, 2019)
6cf3f12  cleanup (Dec 2, 2019)
a9411c8  update docstring (Dec 4, 2019)
3dcc4dc  Rename to patience (Dec 4, 2019)
cb23096  Refactor, greater than is better and test (Dec 4, 2019)
acff80e  Merge branch 'master' of https://github.com/FeatureLabs/evalml into e… (Dec 4, 2019)
1a11a6c  fix merge (Dec 4, 2019)
1dd2ce5  remove check against max_name_len (Dec 4, 2019)
a5bf373  First pass at fixing logic after refactor (Dec 4, 2019)
2ab3c0c  add return for priority conditions (Dec 4, 2019)
64c8b5e  Add small example to docs (Dec 4, 2019)
3b6f076  Merge branch 'master' into early_stopping (Dec 6, 2019)
fd5144d  use best score and best id (Dec 9, 2019)
f99b365  Merge branch 'master' of https://github.com/FeatureLabs/evalml into e… (Dec 9, 2019)
e58acd2  Added tolerance (Dec 9, 2019)
be30773  Merge branch 'early_stopping' of https://github.com/FeatureLabs/evalm… (Dec 9, 2019)
0a44aa1  Add tolerance test for autoregressor (Dec 9, 2019)
1aa3a97  Merge branch 'master' into early_stopping (Dec 9, 2019)
0b2a5a5  Fix documentation of early stopping (Dec 9, 2019)
19f101b  Rename cont and one do_iteration (Dec 9, 2019)
0f100fd  Merge branch 'master' of https://github.com/FeatureLabs/evalml into e… (Dec 10, 2019)
189b14e  Fix for merge (Dec 10, 2019)
2590112  Refactor using new self.results (Dec 10, 2019)
c349781  Fix none-case (Dec 10, 2019)
35c6f6c  Merge branch 'master' into early_stopping (Dec 10, 2019)
ae15446  Correct first best_score check and augment tests (Dec 10, 2019)
338600b  make test consistent (Dec 10, 2019)
9576fd6  cleanup (Dec 11, 2019)
173f611  Make changelog more descriptive (Dec 11, 2019)
75adca9  Use mock results for test (Dec 11, 2019)
10507c9  cleanup (Dec 11, 2019)
d638152  Showcase tolerance in autoclassifier test (Dec 11, 2019)
95425f2  Remove X,y as not fitting anymore (Dec 11, 2019)
ed8593d  Simplify first id (Dec 11, 2019)
c1a60a9  Cleanup init (Dec 12, 2019)
173cdea  Merge branch 'master' of https://github.com/FeatureLabs/evalml into e… (Dec 12, 2019)
1 change: 1 addition & 0 deletions docs/source/changelog.rst
@@ -4,6 +4,7 @@ Changelog
---------
**Future Releases**
* Enhancements
* Add early stopping to AutoML :pr:`241`
* Added ROC and confusion matrix metrics and plot for classification problems and introduce PipelineSearchPlots class :pr:`242`
* Fixes
* Lower botocore requirement :pr:`235`
76 changes: 59 additions & 17 deletions evalml/models/auto_base.py
@@ -23,13 +23,14 @@ class AutoBase:
plot = PipelineSearchPlots

def __init__(self, problem_type, tuner, cv, objective, max_pipelines, max_time,
model_types, detect_label_leakage, start_iteration_callback,
patience, model_types, detect_label_leakage, start_iteration_callback,
add_result_callback, additional_objectives, random_state, verbose):
if tuner is None:
tuner = SKOptTuner
self.objective = get_objective(objective)
self.problem_type = problem_type
self.max_pipelines = max_pipelines
self.patience = patience
self.model_types = model_types
self.detect_label_leakage = detect_label_leakage
self.start_iteration_callback = start_iteration_callback
@@ -58,6 +59,11 @@ def __init__(self, problem_type, tuner, cv, objective, max_pipelines, max_time,
self.max_time = convert_to_seconds(max_time)
else:
raise TypeError("max_time must be a float, int, or string. Received a {}.".format(type(max_time)))

if self.patience:
if (not isinstance(self.patience, int)) or self.patience < 0:
raise ValueError("patience value must be a positive integer. Received {} instead".format(self.patience))

self.results = {}
self.trained_pipelines = {}
self.random_state = random_state
@@ -94,7 +100,6 @@ def fit(self, X, y, feature_types=None, raise_errors=False):

self
"""
# make everything pandas objects
if not isinstance(X, pd.DataFrame):
X = pd.DataFrame(X)

@@ -130,26 +135,60 @@ def fit(self, X, y, feature_types=None, raise_errors=False):
self.logger.log("WARNING: Possible label leakage: %s" % ", ".join(leaked))

if self.max_pipelines is None:
start = time.time()
pbar = tqdm(total=self.max_time, disable=not self.verbose, file=stdout, bar_format='{desc} | Elapsed:{elapsed}')
pbar._instances.clear()
while time.time() - start <= self.max_time:
self._do_iteration(X, y, pbar, raise_errors)
pbar.close()
else:
pbar = tqdm(range(self.max_pipelines), disable=not self.verbose, file=stdout, bar_format='{desc} {percentage:3.0f}%|{bar}| Elapsed:{elapsed}')
pbar._instances.clear()
start = time.time()
for n in pbar:
elapsed = time.time() - start
if self.max_time and elapsed > self.max_time:
pbar.close()
self.logger.log("\n\nMax time elapsed. Stopping search early.")
break
self._do_iteration(X, y, pbar, raise_errors)
pbar.close()

self.logger.log("\n✔ Optimization finished")

start = time.time()
self._do_iteration(X, y, pbar, raise_errors)
pbar.update(1)
while self._check_stopping_condition(start):
self._do_iteration(X, y, pbar, raise_errors)
pbar.update(1)

desc = "✔ Optimization finished"
desc = desc.ljust(self._MAX_NAME_LEN)
pbar.set_description_str(desc=desc, refresh=True)
pbar.close()

def _check_stopping_condition(self, start):
cont = True
msg = None

# check max_time and max_pipelines
elapsed = time.time() - start
if self.max_time and elapsed >= self.max_time:
cont = False
elif self.max_pipelines and len(self.results) >= self.max_pipelines:
cont = False

# check patience
curr_id = max(self.results, key=int)
if self.objective.greater_is_better:
best_id = max(self.results, key=lambda x: self.results[x]['score'])
Review comment, jeremyliweishih (Collaborator, Author): @kmax12 what do you think of adding self._best_id instead of iterating for the best id/score each time? We could also reuse this for best_pipeline. Adding self._best_id would make getting the best score a constant-time check vs. a linear run time.

Reply (Contributor): I'd say that's a premature optimization. Even if there were 1000s of pipelines, a linear search doesn't take much time.
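The tradeoff in that exchange can be made concrete. Below is a hypothetical sketch of the proposed self._best_id bookkeeping (the class and method names are invented for illustration; this is not evalml's actual code): the best id is updated in O(1) as each result arrives, instead of a linear scan over self.results on every stopping-condition check.

```python
class BestTracker:
    """Illustrative sketch: track the best result id incrementally."""

    def __init__(self, greater_is_better=True):
        self.greater_is_better = greater_is_better
        self.best_id = None
        self.results = {}

    def add_result(self, pipeline_id, score):
        # O(1) update: compare the new score against the current best only,
        # rather than re-scanning all of self.results.
        self.results[pipeline_id] = {'score': score}
        if self.best_id is None:
            self.best_id = pipeline_id
            return
        best_score = self.results[self.best_id]['score']
        improved = score > best_score if self.greater_is_better else score < best_score
        if improved:
            self.best_id = pipeline_id


tracker = BestTracker(greater_is_better=True)
for pipeline_id, score in enumerate([0.70, 0.75, 0.73, 0.80, 0.78]):
    tracker.add_result(pipeline_id, score)
print(tracker.best_id)  # 3 (the id with score 0.80)
```

As the reply notes, the linear `max(self.results, key=...)` scan is cheap for realistic pipeline counts, so the simpler version in the diff is a reasonable choice.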

else:
best_id = min(self.results, key=lambda x: self.results[x]['score'])

best_score = self.results[best_id]['score']
if self.patience is not None and curr_id >= best_id + self.patience:
ids_to_check = [i for i in range(best_id, best_id + self.patience + 1)]
scores_to_check = [self.results[id]['score'] for id in ids_to_check]
without_improvement = 0
for score in scores_to_check:
if self.objective.greater_is_better:
if score <= best_score:
without_improvement += 1
else:
if score >= best_score:
without_improvement += 1
if without_improvement >= self.patience:
cont = False
msg = "\n\n{} iterations without improvement. Stopping search early...".format(self.patience)
if not cont and msg:
self.logger.log(msg)
return cont

def check_multiclass(self, y):
if y.nunique() <= 2:
Expand Down Expand Up @@ -226,6 +265,9 @@ def _do_iteration(self, X, y, pbar, raise_errors):
if self.verbose: # To force new line between progress bar iterations
print('')

# return average CV score
return score

def _select_pipeline(self):
return random.choice(self.possible_pipelines)

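Stripped of the surrounding bookkeeping, the patience logic in _check_stopping_condition above amounts to: find the best score so far, then stop once `patience` consecutive results after it have all failed to improve on it. A simplified, self-contained sketch of that logic (a hypothetical helper for illustration, not evalml's API):

```python
def should_stop(scores, patience, greater_is_better=True):
    """Return True when `patience` results after the best score have all
    failed to beat it. `scores` is ordered by iteration id."""
    if patience is None or not scores:
        return False
    best = max(scores) if greater_is_better else min(scores)
    best_idx = scores.index(best)  # first occurrence of the best score
    # Results produced after the best one, in iteration order.
    since_best = scores[best_idx + 1:]
    if len(since_best) < patience:
        return False
    # Mirrors the diff's per-score comparison against best_score.
    improved = any(
        (s > best) if greater_is_better else (s < best)
        for s in since_best[:patience]
    )
    return not improved


# Best score 0.80 at index 1; the next two results do not improve on it.
print(should_stop([0.70, 0.80, 0.78, 0.79], patience=2))  # True
# Best score is the latest result, so no iterations have passed without improvement.
print(should_stop([0.70, 0.80, 0.78, 0.81], patience=2))  # False
```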
5 changes: 5 additions & 0 deletions evalml/models/auto_classifier.py
@@ -17,6 +17,7 @@ def __init__(self,
multiclass=False,
max_pipelines=None,
max_time=None,
patience=None,
model_types=None,
cv=None,
tuner=None,
@@ -41,6 +42,9 @@ def __init__(self,
has elapsed. If it is an integer, then the time will be in seconds.
For strings, time can be specified as seconds, minutes, or hours.

patience (int): Number of iterations without improvement to stop search early. Must be positive.
If None, early stopping is disabled. Defaults to None.

model_types (list): The model types to search. By default searches over all
model_types. Run evalml.list_model_types("classification") to see options.

@@ -84,6 +88,7 @@ def __init__(self,
cv=cv,
max_pipelines=max_pipelines,
max_time=max_time,
patience=patience,
model_types=model_types,
problem_type=problem_type,
detect_label_leakage=detect_label_leakage,
5 changes: 5 additions & 0 deletions evalml/models/auto_regressor.py
@@ -14,6 +14,7 @@ def __init__(self,
objective=None,
max_pipelines=None,
max_time=None,
patience=None,
model_types=None,
cv=None,
tuner=None,
@@ -39,6 +40,9 @@ def __init__(self,
model_types (list): The model types to search. By default searches over all
model_types. Run evalml.list_model_types("regression") to see options.

patience (int): Number of iterations without improvement to stop search early. Must be positive.
If None, early stopping is disabled. Defaults to None.

cv: cross validation method to use. By default StratifiedKFold

tuner: the tuner class to use. Defaults to scikit-optimize tuner
@@ -74,6 +78,7 @@ def __init__(self,
cv=cv,
max_pipelines=max_pipelines,
max_time=max_time,
patience=patience,
model_types=model_types,
problem_type=problem_type,
detect_label_leakage=detect_label_leakage,
12 changes: 12 additions & 0 deletions evalml/tests/automl_tests/test_autoclassifier.py
@@ -269,3 +269,15 @@ def test_max_time_units():

with pytest.raises(TypeError, match="max_time must be a float, int, or string. Received a <class 'tuple'>."):
AutoClassifier(objective='F1', max_time=(30, 'minutes'))


def test_early_stopping(capsys, X_y):
X, y = X_y

with pytest.raises(ValueError, match='patience value must be a positive integer.'):
clf = AutoClassifier(objective='AUC', max_pipelines=5, model_types=['linear_model'], patience=-1, random_state=0)

clf = AutoClassifier(objective='AUC', max_pipelines=5, model_types=['linear_model'], patience=1, random_state=0)
clf.fit(X, y)
out, _ = capsys.readouterr()
assert "1 iterations without improvement. Stopping search early." in out
21 changes: 21 additions & 0 deletions evalml/tests/automl_tests/test_autoregressor.py
@@ -81,3 +81,24 @@ def add_result_callback(results, trained_pipeline, counts=counts):

assert counts["start_iteration_callback"] == max_pipelines
assert counts["add_result_callback"] == max_pipelines


def test_early_stopping(capsys, X_y):
X, y = X_y
clf = AutoRegressor(objective='r2', max_pipelines=5, patience=1, model_types=['linear_model'], random_state=0)
clf.fit(X, y)

out, _ = capsys.readouterr()
assert "1 iterations without improvement. Stopping search early." in out

clf = AutoRegressor(objective='r2', max_time='60 seconds', patience=1, model_types=['linear_model'], random_state=0)
clf.fit(X, y)

out, _ = capsys.readouterr()
assert "1 iterations without improvement. Stopping search early." in out

clf = AutoRegressor(objective='mse', max_time='60 seconds', patience=1, model_types=['linear_model'], random_state=0)
clf.fit(X, y)

out, _ = capsys.readouterr()
assert "1 iterations without improvement. Stopping search early." in out