KoalaEnsembles

For building and training ensembles of supervised learning models (e.g., random forests) in the Koala machine learning environment.

Basic usage

julia> using Koala
julia> using KoalaEnsembles
julia> forest = EnsembleRegressor()
EnsembleRegressor{TreeRegressor}@...192

julia> showall(ans)
EnsembleRegressor{TreeRegressor}@...192
	
key                     | value
------------------------|------------------------
atom                    |TreeRegressor@...292
bagging_fraction        |1.0
n                       |20
weight_regularization   |0.0

The parameter atom is the model being ensembled. The default is an extreme tree:

julia> showall(forest.atom)
TreeRegressor@...292

key                     | value
------------------------|------------------------
extreme                 |true
max_features            |0
max_height              |1000
min_patterns_split      |2
penalty                 |0.0
regularization          |0.0

julia> const X, y = load_ames(); 
julia> train, test = split(1:length(y), 0.8); # 80:20 split 
julia> forest.atom.max_features = 4
julia> forestM = Machine(forest, X, y, train)
SupervisedMachine{EnsembleRegressor,}@...968

julia> fit!(forestM, train)
Computing regressor number: 20    
SupervisedMachine{EnsembleRegressor,}@...968

julia> err(forestM, test) # RMS error on test data
31433.941469754147

Weight regularization

Commonly in ensemble methods, predictions are the means of the predictions of each regressor in the ensemble. Here predictions are weighted sums and the weights are optimized to minimize the RMS training error. Since this sometimes leads to "under-regularized" models the training error is further penalized with a term measuring the deviation of the weights from uniformity. Set the parameter forest.weight_regularization=1 (the default and maximum permitted value) and weights are completely uniform. Set forest.weight_regularization=0 and the training error penalty is dropped altogether.

julia> forest.weight_regularization
1.0

julia> forestM.report
Dict{Symbol,Any} with 1 entry:
:normalized_weights => [0.05, 0.05, 0.05, 0.05,…

To refit the weights with a new regularization penalty, but without changing the ensemble itself, use fit_weights:

julia> forest.weight_regularization = 0.5
julia> fit_weights!(forestM)

20 weights optimized.                
SupervisedMachine{EnsembleRegressor,}@...968

julia> forestM.report
Dict{Symbol,Any} with 1 entry:
:normalized_weights => [1.0196, 0.640301, -0.265808,…

Tuning the parameter forest.weight_regularization is facilitated by the weight_regularization_curve function:

julia> weights, errors = weight_regularization_curve(forestM, test; range = linspace(0,1,51));
julia> using UnicodePlots

julia> lineplot(weights, errors)
         ┌────────────────────────────────────────────┐ 
   60000 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⡆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⢱⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⢸⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⢸⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠸⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠈⠑⠒⠢⠤⠤⠤⣀⣀⣀⣀⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│ 
         │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠉⠉⠉⠉⠉⠑⠒⠒⠒⠒⠒⠒⠢⠤⠤⠤⠤⠤⠤⣀⣀⣀⡀⠀│ 
   30000 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠙│ 
         └────────────────────────────────────────────┘ 
         0                                        1

julia> opt_weight = weights[indmin(errors)]

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
src		src
test		test
.codecov.yml		.codecov.yml
.gitignore		.gitignore
.travis.yml		.travis.yml
LICENSE.md		LICENSE.md
README.md		README.md
REQUIRE		REQUIRE
appveyor.yml		appveyor.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

src

src

test

test

.codecov.yml

.codecov.yml

.gitignore

.gitignore

.travis.yml

.travis.yml

LICENSE.md

LICENSE.md

README.md

README.md

REQUIRE

REQUIRE

appveyor.yml

appveyor.yml

Repository files navigation

KoalaEnsembles

Basic usage

Weight regularization

About

Releases

Packages

Languages

License

ablaom/KoalaEnsembles.jl

Folders and files

Latest commit

History

Repository files navigation

KoalaEnsembles

Basic usage

Weight regularization

About

Resources

License

Stars

Watchers

Forks

Languages