[python-package] reset_parameter() segfaults when passing an unrecognized parameter #6479

chris-hite-akuna · 2024-06-12T23:56:21Z

Description

Calling reset_parameter() with an unrecognized parameter name causes a segmentation fault instead of raising an error.

Reproducible example

>>> m.reset_parameter({'kelp_var1': 123456789})
[LightGBM] [Warning] Unknown parameter: kelp_var1
Segmentation fault (core dumped)

Environment info

LightGBM version or commit hash:

lightgbm 4.3.0 py38h17151c0_0 conda-forge-remote

Command(s) you used to install LightGBM:

conda upgrade lightgbm

Additional Comments

I'm looking for a nice way to put some user metadata about the training data into the model file, so I can avoid models being used in the wrong context. For example, I'm filtering the training data. I've noticed that I can edit the model file directly and LightGBM still loads and saves it.

jameslamb · 2024-06-13T04:28:48Z

Thanks for using LightGBM.

We need more details than this to help you.

What operating system are you using?
Which version of Python?
Can you provide a minimal, reproducible example that demonstrates this behavior?

And just to set the right expectation... segfaulting should never happen so that part is a bug, but you cannot use reset_parameter() to track arbitrary custom data about a model. You'll have to do that some other way (for example, write out a JSON file next to wherever you store your model).
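The JSON-file suggestion above could look roughly like the following. This is only a minimal sketch using the standard library: the helper names (`metadata_path_for`, `write_model_metadata`, `read_model_metadata`) and the `.meta.json` suffix are assumptions made for illustration, not part of LightGBM.

```python
import json
from pathlib import Path


def metadata_path_for(model_path):
    """Return the sidecar path for a model file.

    The '<model file name>.meta.json' naming is an assumption for this sketch.
    """
    model_path = Path(model_path)
    return model_path.with_name(model_path.name + ".meta.json")


def write_model_metadata(model_path, metadata):
    """Write user metadata as JSON next to the model file; return the sidecar path."""
    sidecar = metadata_path_for(model_path)
    sidecar.write_text(
        json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8"
    )
    return sidecar


def read_model_metadata(model_path):
    """Load the metadata previously written next to the model file."""
    sidecar = metadata_path_for(model_path)
    return json.loads(sidecar.read_text(encoding="utf-8"))
```

With something like this, code that loads a model can call `read_model_metadata()` first and refuse to proceed if the recorded context does not match, without touching the model file itself.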

chris-hite-akuna · 2024-06-13T15:20:30Z

$ python --version
Python 3.8.10
$ uname -a
Linux cof-dev-l501 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

>>> import lightgbm as lgb
>>> m = lgb.Booster(model_file='m0.txt')  # you'll need some model
>>> type(m)
<class 'lightgbm.basic.Booster'>
>>> m.reset_parameter({'kelp_var1': 123456789})
[LightGBM] [Warning] Unknown parameter: kelp_var1
Segmentation fault (core dumped)
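Until the crash is fixed, one possible stopgap is to reject unexpected keys in Python before they reach reset_parameter(). The sketch below is an assumption-laden illustration: `checked_reset_parameter` is a hypothetical helper, and `allowed_names` is a set the caller maintains by hand, since this thread does not identify an API for listing valid parameter names.

```python
def checked_reset_parameter(booster, params, allowed_names):
    """Call booster.reset_parameter(params) only if every key is allowed.

    `allowed_names` is a caller-supplied collection of parameter names
    (hypothetical; it is not obtained from LightGBM in this sketch).
    """
    unknown = sorted(set(params) - set(allowed_names))
    if unknown:
        # Fail in Python instead of handing unknown keys to the library.
        raise ValueError(f"Refusing to pass unrecognized parameter(s): {unknown}")
    return booster.reset_parameter(params)
```

For example, `checked_reset_parameter(m, {'kelp_var1': 123456789}, {'learning_rate'})` would raise a `ValueError` rather than calling into the library.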

I realize I'm asking for a feature there. Thanks for making it clear it doesn't exist. Yeah, we can work around it. If I have a model that should only be used on Tuesdays, I can also add "tuesday" to the filename and encode it that way. It would just be nice to have it stored in the model file itself. I guess I can always make a feature request.

jameslamb · 2024-06-14T02:58:53Z

Thanks for that.

I personally would be -1 on the idea of LightGBM supporting storage of arbitrary extra data in model files. That'd add complexity and maintenance burden to this project for, in my opinion, not much value compared to just writing your own data alongside the model.

Write that data to another file and store it alongside the model. If you need to create a single artifact, write multiple files and zip them up in an archive with tar or zip or similar.
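The single-artifact idea can be sketched with Python's standard tarfile module. The function names and the choice of a gzip-compressed tar are assumptions for illustration; a zip archive would work just as well.

```python
import tarfile
from pathlib import Path


def bundle_artifacts(archive_path, file_paths):
    """Pack the given files (e.g. a model and its metadata) into one .tar.gz."""
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for file_path in file_paths:
            file_path = Path(file_path)
            # Store each file under its bare name so the archive has no
            # machine-specific directory structure.
            archive.add(str(file_path), arcname=file_path.name)
    return Path(archive_path)


def list_bundle(archive_path):
    """Return the sorted names of the files stored in the archive."""
    with tarfile.open(str(archive_path), "r:gz") as archive:
        return sorted(archive.getnames())
```

Storing members under their bare file names keeps the bundle portable, at the cost of requiring that the input files have distinct names.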

jameslamb added the bug label Jun 13, 2024

jameslamb changed the title ~~SEGV when setting~~ [python-package] reset_parameter() segfaults when passing an unrecognized parameter Jun 14, 2024
