Description
Hello,
I am running auto-sklearn in Jupyter on a Google Cloud machine. I keep getting an out-of-memory error no matter how much memory I assign to ml_memory_limit. This is the error message:
ValueError: Dummy prediction failed with run state StatusType.MEMOUT and additional output: {'error': 'Memout (used more than 5000 MB).', 'configuration_origin': 'DUMMY'}.
The following is my initialization code:
import autosklearn.classification
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,
    per_run_time_limit=30,
    ml_memory_limit=5000,
    ensemble_size=0,
    include_preprocessors=["no_preprocessing"])
automl.fit(X_train.values, y_train.index.values)
X_train has 400K rows and 5 columns, and y_train is a vector with 400K entries. I am using auto-sklearn==0.10.0. I have tried raising ml_memory_limit beyond 5000 MB, but the program still returns quickly with the same error, so ml_memory_limit doesn't seem to be honored. I have also tried the suggestions in issue #520, but to no avail.
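For scale, here is a rough back-of-the-envelope sketch of how much memory the data described above would occupy as dense arrays. It assumes 8-byte (float64 or int64) values, which is an assumption and not something checked against the actual dtypes; `dense_megabytes` is a hypothetical helper written for this illustration, not part of auto-sklearn.

```python
def dense_megabytes(n_rows, n_cols, bytes_per_value=8):
    """Approximate size in MB of a dense n_rows x n_cols numeric array.

    bytes_per_value=8 assumes float64/int64 storage (an assumption).
    """
    return n_rows * n_cols * bytes_per_value / (1024 ** 2)


# Shapes taken from the description: 400K rows, 5 feature columns, 1 target column.
features_mb = dense_megabytes(400_000, 5)
target_mb = dense_megabytes(400_000, 1)
total_mb = features_mb + target_mb

memory_limit_mb = 5000  # the ml_memory_limit value used above

print(f"features: {features_mb:.1f} MB")
print(f"target:   {target_mb:.1f} MB")
print(f"total:    {total_mb:.1f} MB of a {memory_limit_mb} MB limit")
```

Under these assumptions the raw arrays come to well under 20 MB, so the input data by itself should be nowhere near the 5000 MB limit.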
I tried to run the following example in the Jupyter notebook to make sure I am using the library correctly:
import autosklearn.classification
import sklearn.datasets
import sklearn.metrics
import sklearn.model_selection  # needed for train_test_split below
X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120,
    per_run_time_limit=30
)
automl.fit(X_train, y_train, dataset_name='breast_cancer')
It finished training successfully.
I would appreciate any help from the community!
Environment:
Python version: 3.7.8
Scikit-learn version: 0.22.2.post1
OS: Debian 9
auto-sklearn: 0.10.0