Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: prevent mutation of categorical_feature inside stringifyparams #133

Merged
merged 5 commits into from
May 8, 2024

Conversation

characat0
Copy link
Contributor

Fixes #131

Assign to a new variable instead of assigning inplace. This is useful during CV where the model is trained many times.


This is a copy of #132 but from a cleaner branch

Assign to a new variable instead of assigning inplace.
This is useful during CV where the model is trained many times.
@@ -17,6 +17,16 @@ y_train_regression = rand(1000)
end


@testset "stringifyparams -- multiple calls won't mutate fields" begin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests also pass on the existing LightGBM without your change because the indices are being passed to categorical_feature so each time you call stringifyparams the values in the indices (assigned to categorical_features) will be also mutated, hence the test will always be true even without your change.

The scenario to be considered based on the #133 should be following:

categorical_features = [1, 3, 5, 7, 9]
expected_indices = [0, 2, 4, 6, 8] #as there's `-1` subtraction to match the C's zero-based indices.
classifier = LightGBM.LGBMClassification(categorical_feature = categorical_features)
LightGBM.stringifyparams(classifier; verbosity=-1)
@test expected_indices == classifier.categorical_feature

LightGBM.stringifyparams(classifier; verbosity=-1)
@test expected_indices == classifier.categorical_feature

With your current change the indices are not corrected and they remain [1, 3, 5, 7, 9] not [0, 2, 4, 6, 8] so this also needs to be revised.

This test sis better placed in test_fit.jl rather than utils as the stringifparams function is part of the fit.jl and the existing tests are in test_fit.jl

@kainkad
Copy link
Contributor

kainkad commented Jun 2, 2023

HI @characat0 thank you for raising the issue and submitting PR. I have left comments on your PR as it currently doesn't seem to address the issue raised. Are you ok to follow up on this?

@characat0
Copy link
Contributor Author

Thanks for the feedback, @kainkad. Will submit these changes.

@characat0
Copy link
Contributor Author

We missed a config for the test LGBM_BoosterUpdateOneIterCustom, in microsoft/LightGBM#5438 they added an explicit test for objective==nullptr

@kainkad
Copy link
Contributor

kainkad commented May 7, 2024

We missed a config for the test LGBM_BoosterUpdateOneIterCustom, in microsoft/LightGBM#5438 they added an explicit test for objective==nullptr

Thanks for fixing the mutated indices - it does work fine after the fix so happy to merge. I have a question on the null pointer, isn't it a default for the objective unless specified though?

@characat0
Copy link
Contributor Author

characat0 commented May 7, 2024

I have a question on the null pointer, isn't it a default for the objective unless specified though?

I thought that too, but looks like the default objective is 'regression', I've confirmed this using LightGBM.LGBM_BoosterSaveModelToString

@kainkad kainkad merged commit 1f01391 into IQVIA-ML:master May 8, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: categorical_feature is mutated inside stringifyparams
2 participants