Invalid out_of_bag_measures not caught leading to misleading output #26

Open
ablaom opened this issue Mar 9, 2023 · 1 comment

ablaom commented Mar 9, 2023

Unlike MLJBase.evaluate, the MLJEnsembles design does not seem to allow deterministic measures to be used with a probabilistic model. But rather than catching the offending measure, it silently applies it and returns a wrong result:

using MLJ # or `MLJBase, MLJModels, MLJEnsembles` for minimal install

DecisionTreeClassifier = @iload DecisionTreeClassifier pkg=DecisionTree

atom = DecisionTreeClassifier()
model = EnsembleModel(
    atom;
    bagging_fraction=0.6,
    rng=123,
    out_of_bag_measure = [log_loss, accuracy],
)

X, y = @load_iris # a table and a vector

mach = machine(model, X, y) |> fit!

julia> report(mach).oob_measurements
2-element Vector{Float64}:
 2.0785173454390886
 0.0

The second measurement (accuracy) is reported as 0.0, but the actual accuracy is not zero.
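One plausible explanation for the zero, offered as an assumption and not confirmed anywhere in this thread, is that the deterministic measure ends up comparing probabilistic predictions directly with labels, so nothing ever matches. The toy sketch below illustrates that failure mode in plain Julia. `ToyDist`, `toy_mode` and `toy_accuracy` are hypothetical stand-ins, not MLJ types or functions.

```julia
# Hypothetical stand-in for a probabilistic prediction (NOT an MLJ type).
struct ToyDist
    labels::Vector{String}
    probs::Vector{Float64}
end

# Most probable label of a toy distribution.
toy_mode(d::ToyDist) = d.labels[argmax(d.probs)]

# Naive deterministic accuracy: fraction of predictions equal to the target.
toy_accuracy(yhat, y) = sum(yhat .== y) / length(y)

labels = ["setosa", "versicolor", "virginica"]
yhat = [
    ToyDist(labels, [0.9, 0.1, 0.0]),
    ToyDist(labels, [0.2, 0.7, 0.1]),
    ToyDist(labels, [0.1, 0.3, 0.6]),
]
y = ["setosa", "versicolor", "versicolor"]

# A distribution object never equals a label, so every comparison is false.
naive = toy_accuracy(yhat, y)

# Converting to point predictions first gives a meaningful accuracy.
corrected = toy_accuracy(toy_mode.(yhat), y)

println((naive, corrected))
```

If this is indeed the mechanism, a fix would either need to reject deterministic measures up front or convert probabilistic predictions to point predictions before applying them.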

ablaom added the bug label Mar 9, 2023

ablaom commented Mar 9, 2023

I wonder if some of MLJBase/resampling.jl can be used or factored out to do this properly, with deterministic measures supported. Or at least, how complicated would that be?
