fix(gbdt): correct dtrain assignment in finetune() to use Dataset instead of tuple #2049

SunsetWolf · 2025-11-12T14:35:43Z

Description

lgb.Dataset() never returns None or an empty object, but the data it wraps can be empty, which causes errors later during training. We therefore need to check whether the wrapped data is empty.

The num_data() method returns the number of sample rows, which tells us whether the internal data is empty.

Calling num_data() on a Dataset that has not been constructed raises an error, so construct() must be called first.

construct() releases the raw data by default, which causes an error when lgb.train() runs afterwards, so we pass free_raw_data=False to lgb.Dataset().
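The check described above could look roughly like the sketch below. It is not the code from this PR: the helper names `count_rows` and `build_checked_dataset` and the error message are hypothetical and chosen only for illustration. Only `lgb.Dataset(..., free_raw_data=False)`, `construct()`, and `num_data()` come from the description.

```python
def count_rows(dataset):
    """Return the row count of a LightGBM-style Dataset.

    construct() is called first because, per the description above,
    num_data() fails on a Dataset that has not been constructed yet.
    """
    dataset.construct()
    return dataset.num_data()


def build_checked_dataset(features, labels):
    """Build an lgb.Dataset and raise if it holds no rows (hypothetical helper)."""
    import lightgbm as lgb  # imported lazily so count_rows has no hard dependency

    # free_raw_data=False keeps the raw arrays after construct(), which the
    # description says is needed for a later lgb.train() call to succeed.
    dataset = lgb.Dataset(features, label=labels, free_raw_data=False)
    if count_rows(dataset) == 0:
        raise ValueError("training data is empty")
    return dataset
```

A caller would pass its feature matrix and label vector to `build_checked_dataset` and hand the returned Dataset to lgb.train(). Raising an exception on empty data is one possible design choice; the actual handling in finetune() may differ.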

Reference documentation for lgb.Dataset(): https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.Dataset.html#lightgbm-dataset

Motivation and Context

How Has This Been Tested?

Pass the test by running: pytest qlib/tests/test_all_pipeline.py from the parent directory of qlib.
If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

Pipeline test:
Your own tests:

Types of changes

Fix bugs
Add new feature
Update documentation

fix(gbdt): correct dtrain assignment in finetune() to use Dataset instead of tuple

c81565f

SunsetWolf merged commit 2b41782 into main Nov 13, 2025
97 of 103 checks passed

SunsetWolf deleted the fix/gbdt-finetune-dataset-tuple branch November 13, 2025 03:50

you-n-g mentioned this pull request Nov 10, 2025

chore(main): release 0.9.8 #1989

Open

SunsetWolf mentioned this pull request Nov 13, 2025

The finetune function of model LightGBM has an ERROR #2048

Closed
