Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python-package] Load parameters from model string #6852

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

Samsagax
Copy link

@Samsagax Samsagax commented Mar 2, 2025

This fixes #6851 by using the same workaround as when loading the model from a file (#5424).

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to contribute!

As I just mentioned in #6851 (comment), think the issue goes deeper than just this and that #6101 will be the right fix.

That said, if you want to pursue this PR for just the in-memory string part (not also fixing copy()), then I'm open to it.... but please add a new test covering this behavior.

Otherwise, if you don't have the time/interest in that, we can close this and you can subscribe to #6101.

@Samsagax
Copy link
Author

Samsagax commented Mar 3, 2025

I'll add the requested tests for this based on the minimal example in #6851 . I would like to make this working even if it doesn't fix the copy part since I'll be using the string from memory in my use case (string retrieved from a database elsewhere).

Thanks for taking the time to review it.

@Samsagax Samsagax force-pushed the fix-param-from-string branch from ef2cc77 to 299b83a Compare March 3, 2025 14:10
@Samsagax Samsagax requested a review from jameslamb March 3, 2025 14:10
@Samsagax
Copy link
Author

Samsagax commented Mar 4, 2025

Ok. Test case added. Should also be useful for #6101

Thanks for having the time to review this.

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much! I'm glad this worked and I think it's a really nice improvement.

I left some suggestions to make the tests stricter. Can you please also move that test case to right after the test on the similar logic for loading from a model file?

def test_parameters_are_loaded_from_model_file(tmp_path, capsys, rng):

Samsagax and others added 7 commits March 6, 2025 10:33
Co-authored-by: James Lamb <jaylamb20@gmail.com>
…case

Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
@Samsagax Samsagax requested a review from jameslamb March 6, 2025 16:08
@Samsagax
Copy link
Author

Samsagax commented Mar 6, 2025

I think I've done everything as requested. Should I rebase/squash commits into a more recipe-like structure?

Samsagax and others added 2 commits March 6, 2025 13:15

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks great!

As a quick note... I already see the linting failed: https://github.com/microsoft/LightGBM/actions/runs/13703032288/job/38321483346?pr=6852

You can replicate that locally (and auto-fix the formatting issues) like this:

pip install pre-commit
pre-commit run --all-files

No need to squash commits... we squash on merges here, so every pull request = 1 commit on master: https://github.com/microsoft/LightGBM/commits/master/

You should merge in the latest changes whenever updating though... that helps ensure this is tested against the latest state of the target branch. I just did that, by clicking this button:

image

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Samsagax and others added 3 commits March 6, 2025 13:19
Co-authored-by: James Lamb <jaylamb20@gmail.com>

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Co-authored-by: James Lamb <jaylamb20@gmail.com>

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
@Samsagax
Copy link
Author

Samsagax commented Mar 6, 2025

As a quick note... I already see the linting failed: https://github.com/microsoft/LightGBM/actions/runs/13703032288/job/38321483346?pr=6852

You can replicate that locally (and auto-fix the formatting issues) like this:

pip install pre-commit
pre-commit run --all-files

Thanks

No need to squash commits... we squash on merges here, so every pull request = 1 commit on master: https://github.com/microsoft/LightGBM/commits/master/

You should merge in the latest changes whenever updating though... that helps ensure this is tested against the latest state of the target branch. I just did that, by clicking this button:

I prefer rebasing and rewriting it, but each project has it's own nit-picks. Will do.

Thanks again. The latest version will come in shortly

@Samsagax Samsagax requested a review from jameslamb March 6, 2025 16:33
@Samsagax
Copy link
Author

Samsagax commented Mar 6, 2025

Don't really know why r packaging is failing. But seems to be failing in master too.

@jameslamb jameslamb changed the title [python-package] Fill in params when loading from string [python-package] Load parameters from model string Mar 7, 2025
@jameslamb
Copy link
Collaborator

Don't really know why r packaging is failing. But seems to be failing in master too.

Yes you're right, that job failing is unrelated to your changes. Documented at #6855

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much! I've changed the label to breaking, because this is technically a breaking change from the perspective of anyone who was relying on params passed through the Booster constructor when loading a file, like this:

bst = Booster(model_str=model_str, params={...})

@jmoralez do you have some time to review this too? You're so familiar with these code paths from #5424, I think that'd be really helpful.

I approve of these changes as-is (so nothing else you need to do @Samsagax), but just leaving a "comment" review so my approval won't count towards a merge until we get another approval.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[python-package] Model parameters are lost when loading Booster from model serialized string
2 participants