added test for training when both train and valid are subsets of a si… (#759)

* added test for training when both train and valid are subsets of a single lgb.Dataset object

* pep8 changes

* more pep8

* added test involving subsets of subsets of lgb.Dataset objects

* minor fix to construction of X matrix

* even more pep8

* simplified test further
j-m-hou authored and guolinke committed Aug 18, 2017
1 parent 64e5209 commit e7c5327
Showing 1 changed file with 15 additions and 0 deletions.
15 changes: 15 additions & 0 deletions tests/python_package_test/test_engine.py
@@ -453,3 +453,18 @@ def test_pandas_categorical(self):
        np.testing.assert_almost_equal(pred0, pred2)
        np.testing.assert_almost_equal(pred0, pred3)
        np.testing.assert_almost_equal(pred0, pred4)

    def test_subset_train_val(self):
        '''
        Tests that it's fine to construct a single lgb.Dataset object,
        take subsets of it, and use the subsets for training and validation
        '''
        n = 1000
        X = np.random.normal(size=(n, 2))
        y = np.random.normal(size=n)
        tmp_dat = lgb.Dataset(X, y)
        # take subsets and train; the validation set is a subset of a subset
        tmp_dat_train = tmp_dat.subset(np.arange(int(n * .8)))
        # int() keeps the index array integral (n * .2 * .9 is a float)
        tmp_dat_val = tmp_dat.subset(np.arange(int(n * .8), n)).subset(np.arange(int(n * .2 * .9)))
        params = {'objective': 'regression_l2', 'metric': 'rmse'}
        gbm = lgb.train(params, tmp_dat_train, num_boost_round=20, valid_sets=[tmp_dat_train, tmp_dat_val])
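
The index arithmetic behind the nested validation subset is easy to misread, so here is a minimal sketch of it in plain Python. It does not call LightGBM. It assumes that the indices passed at each subset level are positions within the dataset being subset, not within the original parent. The names train_idx, val_parent_idx, val_local_idx and val_idx are illustrative and do not appear in the commit.

```python
n = 1000

# Parent rows used for training: the first 80%.
train_idx = list(range(int(n * .8)))

# First-level validation subset: the remaining 20% of the parent rows.
val_parent_idx = list(range(int(n * .8), n))

# Second-level subset: positions *within* the first-level subset
# (assumption: indices are local to the dataset being subset).
# int() is needed because n * .2 * .9 evaluates to a float.
val_local_idx = list(range(int(n * .2 * .9)))

# Compose the two levels to recover which parent rows the nested subset covers.
val_idx = [val_parent_idx[i] for i in val_local_idx]

print(len(train_idx), len(val_idx))
print(val_idx[0], val_idx[-1])
print(set(train_idx).isdisjoint(val_idx))
```

Under that assumption the training and validation rows never overlap, and the nested subset covers the first 90% of the held-out slice, which is the situation the test is meant to exercise.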
