fix(gbdt): correct dtrain assignment in finetune() to use Dataset instead of tuple #2049
Description
`lgb.Dataset()` itself does not return `None` or an empty object, but the data inside it can be empty and cause errors in subsequent training, so we need to check whether the internal data is empty. The `num_data()` method returns the number of sample rows, which can be used for this check. However, calling `num_data()` directly raises an error: the `construct()` method must be called first. Since `construct()` releases the raw data by default, which would in turn cause an error when `lgb.train()` is executed, the parameter `free_raw_data=False` is added to the `lgb.Dataset()` call.

Reference documentation for `lgb.Dataset()`: https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.Dataset.html#lightgbm-dataset

Motivation and Context
How Has This Been Tested?
Run `pytest qlib/tests/test_all_pipeline.py` under the upper directory of `qlib`.

Screenshots of Test Results (if appropriate):
Types of changes