Skip to content

Commit

Permalink
fixed picklability of sklearn models with custom obj and updated docs…
Browse files Browse the repository at this point in the history
…tings for custom obj
  • Loading branch information
StrikerRUS committed May 25, 2019
1 parent bb89244 commit bf21bdc
Show file tree
Hide file tree
Showing 4 changed files with 231 additions and 92 deletions.
63 changes: 55 additions & 8 deletions python-package/lightgbm/basic.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ def is_numeric(obj):


def is_numpy_1d_array(data):
"""Check whether data is a 1-D numpy array."""
"""Check whether data is a numpy 1-D array."""
return isinstance(data, np.ndarray) and len(data.shape) == 1


Expand All @@ -69,7 +69,7 @@ def is_1d_list(data):


def list_to_1d_numpy(data, dtype=np.float32, name='list'):
"""Convert data to 1-D numpy array."""
"""Convert data to numpy 1-D array."""
if is_numpy_1d_array(data):
if data.dtype == dtype:
return data
Expand Down Expand Up @@ -1853,9 +1853,20 @@ def update(self, train_set=None, fobj=None):
If None, last training data is used.
fobj : callable or None, optional (default=None)
Customized objective function.
Should accept two parameters: preds, train_data,
and return (grad, hess).
preds : list or numpy 1-D array
The predicted values.
train_data : Dataset
The training dataset.
grad : list or numpy 1-D array
The value of the first order derivative (gradient) for each sample point.
hess : list or numpy 1-D array
The value of the second order derivative (Hessian) for each sample point.
For multi-class task, the score is group by class_id first, then group by row_id.
If you want to get i-th row score in j-th class, the access way is score[j * num_data + i]
For multi-class task, the preds is group by class_id first, then group by row_id.
If you want to get i-th row preds in j-th class, the access way is score[j * num_data + i]
and you should group grad and hess in this way as well.
Returns
Expand Down Expand Up @@ -1902,9 +1913,9 @@ def __boost(self, grad, hess):
Parameters
----------
grad : 1-D numpy array or 1-D list
grad : list or numpy 1-D array
The first order derivative (gradient).
hess : 1-D numpy array or 1-D list
hess : list or numpy 1-D array
The second order derivative (Hessian).
Returns
Expand Down Expand Up @@ -1994,8 +2005,20 @@ def eval(self, data, name, feval=None):
Name of the data.
feval : callable or None, optional (default=None)
Customized evaluation function.
Should accept two parameters: preds, train_data,
Should accept two parameters: preds, eval_data,
and return (eval_name, eval_result, is_higher_better) or list of such tuples.
preds : list or numpy 1-D array
The predicted values.
eval_data : Dataset
The evaluation dataset.
eval_name : string
The name of evaluation function.
eval_result : float
The eval result.
is_higher_better : bool
Is eval result higher better, e.g. AUC is ``is_higher_better``.
For multi-class task, the preds is group by class_id first, then group by row_id.
If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].
Expand Down Expand Up @@ -2030,6 +2053,18 @@ def eval_train(self, feval=None):
Customized evaluation function.
Should accept two parameters: preds, train_data,
and return (eval_name, eval_result, is_higher_better) or list of such tuples.
preds : list or numpy 1-D array
The predicted values.
train_data : Dataset
The training dataset.
eval_name : string
The name of evaluation function.
eval_result : float
The eval result.
is_higher_better : bool
Is eval result higher better, e.g. AUC is ``is_higher_better``.
For multi-class task, the preds is group by class_id first, then group by row_id.
If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].
Expand All @@ -2047,8 +2082,20 @@ def eval_valid(self, feval=None):
----------
feval : callable or None, optional (default=None)
Customized evaluation function.
Should accept two parameters: preds, train_data,
Should accept two parameters: preds, valid_data,
and return (eval_name, eval_result, is_higher_better) or list of such tuples.
preds : list or numpy 1-D array
The predicted values.
valid_data : Dataset
The validation dataset.
eval_name : string
The name of evaluation function.
eval_result : float
The eval result.
is_higher_better : bool
Is eval result higher better, e.g. AUC is ``is_higher_better``.
For multi-class task, the preds is group by class_id first, then group by row_id.
If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].
Expand Down
58 changes: 57 additions & 1 deletion python-package/lightgbm/engine.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,10 +39,38 @@ def train(params, train_set, num_boost_round=100,
Names of ``valid_sets``.
fobj : callable or None, optional (default=None)
Customized objective function.
Should accept two parameters: preds, train_data,
and return (grad, hess).
preds : list or numpy 1-D array
The predicted values.
train_data : Dataset
The training dataset.
grad : list or numpy 1-D array
The value of the first order derivative (gradient) for each sample point.
hess : list or numpy 1-D array
The value of the second order derivative (Hessian) for each sample point.
For multi-class task, the preds is group by class_id first, then group by row_id.
If you want to get i-th row preds in j-th class, the access way is score[j * num_data + i]
and you should group grad and hess in this way as well.
feval : callable or None, optional (default=None)
Customized evaluation function.
Should accept two parameters: preds, train_data,
and return (eval_name, eval_result, is_higher_better) or list of such tuples.
preds : list or numpy 1-D array
The predicted values.
train_data : Dataset
The training dataset.
eval_name : string
The name of evaluation function.
eval_result : float
The eval result.
is_higher_better : bool
Is eval result higher better, e.g. AUC is ``is_higher_better``.
For multi-class task, the preds is group by class_id first, then group by row_id.
If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].
To ignore the default metric corresponding to the used objective,
Expand Down Expand Up @@ -373,11 +401,39 @@ def cv(params, train_set, num_boost_round=100,
Evaluation metrics to be monitored while CV.
If not None, the metric in ``params`` will be overridden.
fobj : callable or None, optional (default=None)
Custom objective function.
Customized objective function.
Should accept two parameters: preds, train_data,
and return (grad, hess).
preds : list or numpy 1-D array
The predicted values.
train_data : Dataset
The training dataset.
grad : list or numpy 1-D array
The value of the first order derivative (gradient) for each sample point.
hess : list or numpy 1-D array
The value of the second order derivative (Hessian) for each sample point.
For multi-class task, the preds is group by class_id first, then group by row_id.
If you want to get i-th row preds in j-th class, the access way is score[j * num_data + i]
and you should group grad and hess in this way as well.
feval : callable or None, optional (default=None)
Customized evaluation function.
Should accept two parameters: preds, train_data,
and return (eval_name, eval_result, is_higher_better) or list of such tuples.
preds : list or numpy 1-D array
The predicted values.
train_data : Dataset
The training dataset.
eval_name : string
The name of evaluation function.
eval_result : float
The eval result.
is_higher_better : bool
Is eval result higher better, e.g. AUC is ``is_higher_better``.
For multi-class task, the preds is group by class_id first, then group by row_id.
If you want to get i-th row preds in j-th class, the access way is preds[j * num_data + i].
To ignore the default metric corresponding to the used objective,
Expand Down

0 comments on commit bf21bdc

Please sign in to comment.