This package is a massive simplification of the original GradientBoost.jl package. I decided to strip it down to the bare minimum by removing all the goodies. Why? I just needed bare-bones boosting for experiments in which I want to boost some algorithm (like neural networks). I also updated the algorithm to be compatible with MLUtils.jl and LossFunctions.jl, to use Optim.jl, and to use Zygote as a fallback for gradients of custom loss functions (ForwardDiff might be a better fit here).
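To give an idea of what the Zygote fallback means in practice, the snippet below shows how the gradient of a user-defined loss can be obtained automatically. This is only an illustration of the mechanism, not the package's internal code, and the function names are made up for the example.

```julia
using Zygote
using Statistics: mean

# A user-defined loss with no hand-written gradient (hypothetical example).
my_loss(prediction, target) = mean((prediction .- target) .^ 2)

prediction = [0.2, -0.4, 0.1]
target = [1.0, -1.0, 1.0]

# Zygote differentiates the loss with respect to the predictions,
# which is exactly the quantity (pseudo-residuals) gradient boosting fits.
pseudo_residuals = -Zygote.gradient(p -> my_loss(p, target), prediction)[1]
```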
The package is designed to sprinkle boosting on top of your ML algorithm. As such, it does not implement any algorithm for learning the classifiers inside the boosting loop. A simple example with decision stumps can be found in test_ml.jl. A more sophisticated example is in example/mutagenesis.jl, where we show how to boost a classifier of structured data. In the rest of this readme, the example with decision stumps is walked through.
Let's start by importing libraries and defining some training data.
```julia
using Test
using GradientBoost
using LossFunctions
using Statistics: mean   # `mean` is used in the stump fitting below

x = Float64[
    1  0 -1  0;
    0  1  0 -1]
y = [-1, 1, -1, 1]
```

The classifier we want to boost is a decision tree of depth one, called a decision stump. The decision stump is simple, implementing a variant of the rule `xᵢ ≥ τ ? +1 : -1`, where `xᵢ` is the value of the i-th feature. We define the decision stump as a simple callable struct.
```julia
struct Stump{T}
    dim::Int   # index of the feature the stump splits on
    τ::T       # threshold
    s::Int     # orientation of the split (+1 or -1)
end

# Prediction for a single sample.
function (s::Stump)(x::AbstractVector)
    x[s.dim] ≥ s.τ ? s.s : -s.s
end

# Prediction for a matrix of samples, one sample per column.
function (s::Stump)(x::AbstractMatrix)
    vec(mapslices(s, x, dims = 1))
end
```

To use `Stump` as a learner inside the boosting algorithm, we need to overload the `learner_fit` and `learner_predict` functions. Using multiple dispatch, we can specialize the fitting for different loss functions and different learners. For the purpose of dispatch, we define `StumpLearner` to signal that we want to learn a `Stump`, and overload `learner_fit` as
```julia
struct StumpLearner end

# `wy` carries the pseudo-residuals supplied by the boosting loop; their signs
# are used as target labels and their magnitudes as sample weights.
function GradientBoost.learner_fit(lf, learner::StumpLearner, x::AbstractMatrix, wy::Vector{<:Real})
    w = abs.(wy)
    y = sign.(wy)
    best_stump = Stump(1, mean(x[1, :]), 1)
    best_err = mean(w .* (y .!= best_stump(x)))
    for dim in axes(x, 1)
        # candidate thresholds: midpoints between consecutive sorted feature values
        xs = sort(x[dim, :])
        τs = 0.5 .* (xs[2:end] .+ xs[1:end-1])
        for τ in τs
            for s in [-1, +1]
                e = mean(w .* (y .!= Stump(dim, τ, s)(x)))
                if e < best_err
                    best_stump = Stump(dim, τ, s)
                    best_err = e
                end
            end
        end
    end
    best_stump
end
```

Now, we define the function providing the prediction as
```julia
function GradientBoost.learner_predict(::Loss, ::StumpLearner, s::Stump, x)
    s(x)
end
```
Finally, the boosting is run as

```julia
gbl = GBBL(StumpLearner(); loss_function = ExpLoss, num_iterations = 4, learning_rate = 1, sampling_rate = 1)
model = fit(gbl, x, y)
predictions = GradientBoost.predict(model, x)
@test 2 .* (predictions .> 0) .- 1 == y
```

A few notes:

- I got rid of the ML API, as it did not serve my purpose.
- The loss function has the signature `loss(prediction, true_labels)`; a short sketch of a conforming custom loss is given below.
- I would like to thank the author of the original GradientBoost.jl library. I just needed something super simple.
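As an illustration of that signature, here is a minimal sketch of a custom loss (the name `exp_margin_loss` is made up for this example; whether such a function can be passed directly as `loss_function` to `GBBL` depends on the package, so treat this as a sketch rather than documented usage):

```julia
using Statistics: mean

# Hypothetical custom loss following the loss(prediction, true_labels)
# convention; an exponential margin loss similar in spirit to ExpLoss.
exp_margin_loss(prediction, true_labels) = mean(exp.(-true_labels .* prediction))
```

With no analytic gradient provided, Zygote would be the fallback for differentiating such a function, as mentioned at the top of this readme.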
References:
- Friedman, Jerome H. "Greedy function approximation: a gradient boosting machine." Annals of Statistics (2001): 1189-1232.
- Friedman, Jerome H. "Stochastic gradient boosting." Computational Statistics & Data Analysis 38.4 (2002): 367-378.
- Hastie, Trevor, et al. The elements of statistical learning. Vol. 2. No. 1. New York: Springer, 2009.
- Ridgeway, Greg. "Generalized Boosted Models: A guide to the gbm package." Update 1.1 (2007).
- Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." The Journal of Machine Learning Research 12 (2011): 2825-2830.
- Natekin, Alexey, and Alois Knoll. "Gradient boosting machines, a tutorial." Frontiers in neurorobotics 7 (2013).