
numpy.core._exceptions._ArrayMemoryError: during CatBoostRegressor Training #2405

Closed
Karrvp opened this issue Jun 7, 2023 · 6 comments
Karrvp commented Jun 7, 2023

Problem: model.fit fails with numpy.core._exceptions._ArrayMemoryError: Unable to allocate 71.6 GiB for an array with shape (98000, 98000) and data type float64
catboost version: 1.2
Operating System: Ubuntu

Partial Code:

import numpy as np
import catboost as cb
from sklearn.model_selection import GridSearchCV

grid = {"learning_rate": [0.01,0.03],
"depth": [4,6],
"iterations": [10],
}

cbr = cb.CatBoostRegressor(loss_function='RMSE',eval_metric="RMSE",boosting_type ='Plain')#,task_type='GPU')
gscv = GridSearchCV(estimator = cbr, param_grid = grid)#, cv = 3)#, n_jobs=-1)
gscv.fit(X_train, y_train, cat_features = categorical_features_indices)

Error:

0: learn: 1.5782847 total: 207ms remaining: 1.86s
1: learn: 1.5758104 total: 378ms remaining: 1.51s
2: learn: 1.5733945 total: 550ms remaining: 1.28s
3: learn: 1.5710246 total: 678ms remaining: 1.02s
4: learn: 1.5686950 total: 805ms remaining: 805ms
5: learn: 1.5663753 total: 932ms remaining: 622ms
6: learn: 1.5641121 total: 1.06s remaining: 453ms
7: learn: 1.5618219 total: 1.18s remaining: 296ms
8: learn: 1.5596248 total: 1.31s remaining: 146ms
9: learn: 1.5576024 total: 1.44s remaining: 0us
/opt/conda/anaconda/lib/python3.7/site-packages/sklearn/model_selection/_validation.py:774: UserWarning: Scoring failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
File "/opt/conda/anaconda/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 761, in _score
scores = scorer(estimator, X_test, y_test)
File "/opt/conda/anaconda/lib/python3.7/site-packages/sklearn/metrics/_scorer.py", line 418, in _passthrough_scorer
return estimator.score(*args, **kwargs)
File "/opt/conda/anaconda/lib/python3.7/site-packages/catboost/core.py", line 5856, in score
residual_sum_of_squares = np.sum((y - predictions) ** 2)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 71.6 GiB for an array with shape (98000, 98000) and data type float64
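For context, the reported size is exactly what a dense square float64 matrix over all 98,000 test rows would occupy; a quick back-of-the-envelope check (not part of the original report):

# A (98000, 98000) float64 array needs 98000 * 98000 * 8 bytes,
# which matches the ~71.6 GiB reported in the traceback above.
n = 98000
print(n * n * 8 / 2**30)  # ~71.56 GiB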


Karrvp commented Jun 7, 2023

Hi @nikitxskv, thanks a lot for your reply.

I need some help converting my numpy input to the FeaturesData class. Most of the columns in my input dataset are categorical. I've attached sample data for reference; the dataset is approximately 3 million rows × 30 columns.
sample data.csv

categorical_features_indices = np.where(np.isin(X_train[X_train.columns].dtypes, ['bool', 'object']))[0]
work_time_sec is the value to be predicted.

col = "work_time_sec"
X = df.loc[:, underperform.columns != col]
y = df.loc[:, underperform.columns == col]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
cbr = cb.CatBoostRegressor(loss_function='RMSE',eval_metric="RMSE",boosting_type ='Plain')#,task_type='GPU')
gscv = GridSearchCV(estimator = cbr, param_grid = grid)#, cv = 3)#, n_jobs=-1)
gscv.fit(X_train, y_train, cat_features = categorical_features_indices)
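Since the question above asks about FeaturesData, here is a minimal sketch of one way to build it for a mostly-categorical frame. The select_dtypes split and the UTF-8 encoding step are assumptions about the data, not code from this thread, and a Pool built this way is passed to fit directly rather than to GridSearchCV (which expects a plain array-like X):

import numpy as np
from catboost import CatBoostRegressor, Pool, FeaturesData

# Assumed split of X_train into categorical (bool/object) and numeric columns.
cat_cols = X_train.select_dtypes(include=['bool', 'object']).columns
num_cols = X_train.columns.difference(cat_cols)

features = FeaturesData(
    num_feature_data=X_train[num_cols].to_numpy(dtype=np.float32),
    # cat_feature_data must be an object array of UTF-8 encoded bytes
    cat_feature_data=X_train[cat_cols].astype(str).applymap(lambda s: s.encode('utf-8')).to_numpy(dtype=object),
    num_feature_names=list(num_cols),
    cat_feature_names=list(cat_cols),
)

train_pool = Pool(data=features, label=y_train.to_numpy().reshape(-1))
cbr = CatBoostRegressor(loss_function='RMSE', eval_metric='RMSE', boosting_type='Plain')
cbr.fit(train_pool)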

nikitxskv (Collaborator) commented

Hi @Karrvp!
I see now that the problem is something else.
I'll answer as soon as I understand what's going on!

nikitxskv (Collaborator) commented

Hi @Karrvp!
Can you try reshaping y_train?
I suspect that you have y_train.shape == (N, 1).
You need to make y_train = y_train.reshape((N,)) and then run gscv.fit again.

P.S. We will fix this bug in the next release!
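A small sketch of why the (N, 1) target blows up memory during scoring: NumPy broadcasts an (N, 1) column against the (N,) prediction vector into an (N, N) matrix, which for N = 98000 in float64 is the ~71.6 GiB from the traceback (illustrative values only):

import numpy as np

N = 5                       # tiny stand-in for the 98000 rows in the report
y = np.zeros((N, 1))        # target kept as a single-column 2-D array
predictions = np.zeros(N)   # model.predict returns a 1-D array of shape (N,)

print((y - predictions).shape)                # (5, 5): broadcast to a square matrix
print((y.reshape((N,)) - predictions).shape)  # (5,): the suggested fix above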


Karrvp commented Jun 12, 2023

Hi @nikitxskv,

After implementing the change, I am getting a different warning followed by kernel termination. Re-execution also fails.

/opt/conda/anaconda/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py:706: UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.
"timeout or by a memory leak.", UserWarning

(screenshots attached)

nikitxskv (Collaborator) commented

Try to do y_train = y_train.to_numpy().reshape((N,)),
not y_train = y_train.to_numpy().reshape((N,1))
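A possible way to apply this to the earlier snippet, assuming y_train is still the single-column DataFrame produced by train_test_split (reshape(-1) is equivalent to reshape((N,)) without needing N explicitly):

y_train = y_train.to_numpy().reshape(-1)  # shape (N,) instead of (N, 1)
y_test = y_test.to_numpy().reshape(-1)

gscv.fit(X_train, y_train, cat_features=categorical_features_indices)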

nikitxskv (Collaborator) commented

By the way, here is the fix: 733e63a
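For anyone pinned to a catboost release that predates this commit, one possible workaround (not from this thread; the rmse helper below is hypothetical) is to give GridSearchCV an explicit scorer so that CatBoostRegressor.score is never called:

import numpy as np
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV

def rmse(y_true, y_pred):
    # Flatten both arrays so an (N, 1) target cannot broadcast against (N,) predictions.
    y_true = np.asarray(y_true).reshape(-1)
    y_pred = np.asarray(y_pred).reshape(-1)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

gscv = GridSearchCV(estimator=cbr, param_grid=grid,
                    scoring=make_scorer(rmse, greater_is_better=False))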
