[python-package] reset_parameter() segfaults when passing an unrecognized parameter #6479

Open
chris-hite-akuna opened this issue Jun 12, 2024 · 3 comments

@chris-hite-akuna

Description

reset_parameter() segfaults when passed an unrecognized parameter name.

Reproducible example

m.reset_parameter({'kelp_var1': 123456789})
[LightGBM] [Warning] Unknown parameter: kelp_var1
Segmentation fault (core dumped)

Environment info

LightGBM version or commit hash:
lightgbm 4.3.0 py38h17151c0_0 conda-forge-remote
Command(s) you used to install LightGBM

conda upgrade lightgbm

Additional Comments

I'm looking for a nice way to put some user metadata about the training data into the model file, so I can avoid models being used in the wrong context. For example, I'm filtering the training data. I've noticed I can modify the file directly and LightGBM still loads and saves it.

@jameslamb added the bug label Jun 13, 2024
@jameslamb
Collaborator

Thanks for using LightGBM.

We need more details than this to help you.

  • what operating system?
  • version of Python?
  • can you provide a minimal, reproducible example that demonstrates this behavior?

And just to set the right expectation... segfaulting should never happen so that part is a bug, but you cannot use reset_parameter() to track arbitrary custom data about a model. You'll have to do that some other way (for example, write out a JSON file next to wherever you store your model).
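A minimal sketch of the "JSON file next to the model" idea, using only the Python standard library. The helper names, the `.meta.json` suffix, and the metadata keys are hypothetical choices for illustration, not part of LightGBM.

```python
import json
from pathlib import Path


def sidecar_path(model_path):
    # Hypothetical naming convention: "m0.txt" -> "m0.txt.meta.json".
    model_path = Path(model_path)
    return model_path.with_name(model_path.name + ".meta.json")


def save_metadata(model_path, metadata):
    # Write user metadata as JSON next to the model file.
    path = sidecar_path(model_path)
    path.write_text(json.dumps(metadata, indent=2, sort_keys=True))
    return path


def load_metadata(model_path):
    # Read the metadata back; raises FileNotFoundError if no sidecar exists.
    return json.loads(sidecar_path(model_path).read_text())
```

Assuming the model was saved to m0.txt, save_metadata("m0.txt", {"valid_days": ["tuesday"]}) would write m0.txt.meta.json beside it, and the loading code could check load_metadata("m0.txt") before predicting.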

@chris-hite-akuna
Author

$ python --version
Python 3.8.10
$ uname -a
Linux cof-dev-l501 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

>>> import lightgbm as lgb
>>> m = lgb.Booster(model_file='m0.txt')  # you'll need some model
>>> type(m)
<class 'lightgbm.basic.Booster'>
>>> m.reset_parameter({'kelp_var1': 123456789})
[LightGBM] [Warning] Unknown parameter: kelp_var1
Segmentation fault (core dumped)

I realize I'm asking for a feature there. Thanks for making it clear it doesn't exist. Yeah, we can work around it. If I have a model that should only be used on Tuesdays, I can also add "tuesday" to the filename and encode it that way. It would just be nice to have it internally in the file. I guess I can always make a feature request.

@jameslamb changed the title from "SEGV when setting" to "[python-package] reset_parameter() segfaults when passing an unrecognized parameter" Jun 14, 2024
@jameslamb
Collaborator

Thanks for that.

I personally would be -1 on the idea of LightGBM supporting storage of arbitrary extra data in model files. That'd add complexity and maintenance burden to this project for, in my opinion, not much value compared to just writing your own data alongside the model.

Write that data to another file and store it alongside the model. If you need to create a single artifact, write multiple files and zip them up in an archive with tar or zip or similar.
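One way the single-artifact suggestion could look, sketched with Python's standard zipfile module. The function names and the "metadata.json" entry name are assumptions made for this example; the model file is stored in the archive under its own filename.

```python
import json
import zipfile
from pathlib import Path


def write_bundle(model_path, metadata, archive_path):
    # Pack the model file and a JSON metadata entry into one zip archive.
    model_path = Path(model_path)
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.write(model_path, arcname=model_path.name)
        # "metadata.json" is a hypothetical entry name chosen for this sketch.
        zf.writestr("metadata.json", json.dumps(metadata, indent=2, sort_keys=True))
    return Path(archive_path)


def read_bundle_metadata(archive_path):
    # Read only the metadata entry, without extracting the model.
    with zipfile.ZipFile(archive_path) as zf:
        return json.loads(zf.read("metadata.json"))
```

A consumer can call read_bundle_metadata() first, decide whether the model applies to its context, and only then extract the model file and load it.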
