In [1]:
# libraries
using Flux              # the julia ml library
using Images            # image processing and machine vision for julia
using MLJ               # make_blobs, rmse, confmat, categorical
using MLDataUtils       # label, nlabel, labelfreq
using MLDatasets        # mnist

using GLM               # lm for linear regression; classification here uses MLJLinearModels instead
using MLJLinearModels   # LogisticClassifier

using LinearAlgebra     # pinv pseudo-inverse matrix
using Metrics           # r2-score
using Random
using StatsBase         # standardize (normalization)
using Distributions

using Plots; gr()
using StatsPlots
using Printf

using CSV
using DataFrames


In [2]:
all_models = models()

186-element Vector{NamedTuple{(:name, :package_name, :is_supervised, :abstract_type, :deep_properties, :docstring, :fit_data_scitype, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :inverse_transform_scitype, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :predict_scitype, :prediction_type, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :transform_scitype, :input_scitype, :target_scitype, :output_scitype), T} where T<:Tuple}:
 (name = ABODDetector, package_name = OutlierDetectionNeighbors, ... )
 (name = ABODDetector, package_name = OutlierDetectionPython, ... )
 (name = AEDetector, package_name = OutlierDetectionNetworks, ... )
 (name = ARDRegressor, package_name = ScikitLearn, ... )
 (name = AdaBoostClassifier, package_name = ScikitLearn, ... )
 (name = AdaBoostRegressor, package_name = ScikitLearn, ... )
 (name = AdaBoostStumpClassifier, pa

In [3]:
model = @load LogisticClassifier pkg="MLJLinearModels"

┌ Info: For silent loading, specify `verbosity=0`. 
└ @ Main /home/ciro/.julia/packages/MLJModels/Ci1zC/src/loading.jl:168


import MLJLinearModels ✔


LogisticClassifier

In [5]:
?learning_curve

search: [0m[1ml[22m[0m[1me[22m[0m[1ma[22m[0m[1mr[22m[0m[1mn[22m[0m[1mi[22m[0m[1mn[22m[0m[1mg[22m[0m[1m_[22m[0m[1mc[22m[0m[1mu[22m[0m[1mr[22m[0m[1mv[22m[0m[1me[22m [0m[1ml[22m[0m[1me[22m[0m[1ma[22m[0m[1mr[22m[0m[1mn[22m[0m[1mi[22m[0m[1mn[22m[0m[1mg[22m[0m[1m_[22m[0m[1mc[22m[0m[1mu[22m[0m[1mr[22m[0m[1mv[22m[0m[1me[22m!



```
curve = learning_curve(mach; resolution=30,
                             resampling=Holdout(),
                             repeats=1,
                             measure=default_measure(mach.model),
                             rows=nothing,
                             weights=nothing,
                             operation=nothing,
                             range=nothing,
                             acceleration=default_resource(),
                             acceleration_grid=CPU1(),
                             rngs=nothing,
                             rng_name=nothing)
```

Given a supervised machine `mach`, returns a named tuple of objects suitable for generating a plot of performance estimates, as a function of the single hyperparameter specified in `range`. The tuple `curve` has the following keys: `:parameter_name`, `:parameter_scale`, `:parameter_values`, `:measurements`.

To generate multiple curves for a `model` with a random number generator (RNG) as a hyperparameter, specify the name, `rng_name`, of the (possibly nested) RNG field, and a vector `rngs` of RNG's, one for each curve. Alternatively, set `rngs` to the number of curves desired, in which case RNG's are automatically generated. The individual curve computations can be distributed across multiple processes using `acceleration=CPUProcesses()` or `acceleration=CPUThreads()`. See the second example below for a demonstration.

```julia
X, y = @load_boston;
atom = @load RidgeRegressor pkg=MultivariateStats
ensemble = EnsembleModel(atom=atom, n=1000)
mach = machine(ensemble, X, y)
r_lambda = range(ensemble, :(atom.lambda), lower=10, upper=500, scale=:log10)
curve = learning_curve(mach; range=r_lambda, resampling=CV(), measure=mav)
using Plots
plot(curve.parameter_values,
     curve.measurements,
     xlab=curve.parameter_name,
     xscale=curve.parameter_scale,
     ylab = "CV estimate of mean absolute error")
```

If using a `Holdout()` `resampling` strategy (with no shuffling) and if the specified hyperparameter is the number of iterations in some iterative model (and that model has an appropriately overloaded `MLJModelInterface.update` method), then training is not restarted from scratch for each increment of the parameter, i.e., the model is trained progressively.

```julia
atom.lambda=200
r_n = range(ensemble, :n, lower=1, upper=250)
curves = learning_curve(mach; range=r_n, verbosity=0, rng_name=:rng, rngs=3)
plot!(curves.parameter_values,
     curves.measurements,
     xlab=curves.parameter_name,
     ylab="Holdout estimate of RMS error")
```

```
learning_curve(model::Supervised, X, y; kwargs...)
learning_curve(model::Supervised, X, y, w; kwargs...)
```

Plot a learning curve (or curves) directly, without first constructing a machine.

### Summary of key-word options

  * `resolution` - number of points generated from `range` (number of model evaluations); default is `30`
  * `acceleration` - parallelization option for passing to `evaluate!`; an instance of `CPU1`, `CPUProcesses` or `CPUThreads` from `ComputationalResources.jl`; default is `default_resource()`
  * `acceleration_grid` - parallelization option for distributing each performance evaluation
  * `rngs` - for specifying random number generator(s) to be passed to the model (see above)
  * `rng_name` - name of the model hyper-parameter representing a random number generator (see above); possibly nested

Other key-word options are documented at [`TunedModel`](@ref).
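
To make the shape of the returned named tuple concrete, the following is a hand-rolled sketch of a holdout learning curve that does not use MLJ at all. Everything in it is an assumption made for illustration: the synthetic data, the closed-form `ridge_fit` helper, the `mae` measure, and the 70/30 unshuffled split. It mirrors the keys `:parameter_name`, `:parameter_scale`, `:parameter_values` and `:measurements` described above, but it is not how `learning_curve` is implemented.

```julia
using LinearAlgebra, Random, Statistics

# Synthetic regression data (hypothetical, for illustration only).
rng = MersenneTwister(42)
n, p = 200, 5
X = randn(rng, n, p)
y = X * collect(1.0:p) .+ 0.5 .* randn(rng, n)

# Holdout split without shuffling: first 70% of rows train, the rest test.
ntrain = round(Int, 0.7n)
train, test = 1:ntrain, (ntrain + 1):n

# Closed-form ridge regression: solve (X'X + λI) β = X'y.
ridge_fit(X, y, λ) = (X' * X + λ * I) \ (X' * y)

# Mean absolute error between predictions and targets.
mae(ŷ, y) = mean(abs.(ŷ .- y))

# 30 log-spaced values from 10 to 500, analogous to
# `lower=10, upper=500, scale=:log10` with `resolution=30`.
λs = exp10.(range(log10(10), log10(500); length=30))

# One holdout estimate per hyperparameter value.
measurements = map(λs) do λ
    β = ridge_fit(X[train, :], y[train], λ)
    mae(X[test, :] * β, y[test])
end

curve = (parameter_name = "lambda",
         parameter_scale = :log10,
         parameter_values = λs,
         measurements = measurements)
```

Plotting `curve.parameter_values` against `curve.measurements` with a log-scaled x axis would then give the same kind of figure as the MLJ example.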


In [6]:
?round

search: [0m[1mr[22m[0m[1mo[22m[0m[1mu[22m[0m[1mn[22m[0m[1md[22m [0m[1mr[22m[0m[1mo[22m[0m[1mu[22m[0m[1mn[22m[0m[1md[22ming [0m[1mR[22m[0m[1mo[22m[0m[1mu[22m[0m[1mn[22m[0m[1md[22mUp [0m[1mR[22m[0m[1mo[22m[0m[1mu[22m[0m[1mn[22m[0m[1md[22mDown [0m[1mR[22m[0m[1mo[22m[0m[1mu[22m[0m[1mn[22m[0m[1md[22mToZero [0m[1mR[22m[0m[1mo[22m[0m[1mu[22m[0m[1mn[22m[0m[1md[22mingMode [0m[1mR[22m[0m[1mo[22m[0m[1mu[22m[0m[1mn[22m[0m[1md[22mNearest



```
round(z::Complex[, RoundingModeReal, [RoundingModeImaginary]])
round(z::Complex[, RoundingModeReal, [RoundingModeImaginary]]; digits=, base=10)
round(z::Complex[, RoundingModeReal, [RoundingModeImaginary]]; sigdigits=, base=10)
```

Return the nearest integral value of the same type as the complex-valued `z` to `z`, breaking ties using the specified [`RoundingMode`](@ref)s. The first [`RoundingMode`](@ref) is used for rounding the real components while the second is used for rounding the imaginary components.

# Example

```jldoctest
julia> round(3.14 + 4.5im)
3.0 + 4.0im
```
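
As a small illustration of the two-mode form described above, this sketch rounds the real and imaginary parts of a complex number in different directions. The variable names are arbitrary.

```julia
z = 2.5 + 2.5im

# One mode given: it is applied to both components.
both_up = round(z, RoundUp)

# Two modes given: the first applies to the real part,
# the second to the imaginary part.
mixed = round(z, RoundUp, RoundDown)

# No mode given: the default RoundNearest is used for both parts.
default = round(3.14 + 4.5im)
```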

---

```
round([T,] x, [r::RoundingMode])
round(x, [r::RoundingMode]; digits::Integer=0, base = 10)
round(x, [r::RoundingMode]; sigdigits::Integer, base = 10)
```

Rounds the number `x`.

Without keyword arguments, `x` is rounded to an integer value, returning a value of type `T`, or of the same type of `x` if no `T` is provided. An [`InexactError`](@ref) will be thrown if the value is not representable by `T`, similar to [`convert`](@ref).

If the `digits` keyword argument is provided, it rounds to the specified number of digits after the decimal place (or before if negative), in base `base`.

If the `sigdigits` keyword argument is provided, it rounds to the specified number of significant digits, in base `base`.

The [`RoundingMode`](@ref) `r` controls the direction of the rounding; the default is [`RoundNearest`](@ref), which rounds to the nearest integer, with ties (fractional values of 0.5) being rounded to the nearest even integer. Note that `round` may give incorrect results if the global rounding mode is changed (see [`rounding`](@ref)).

# Examples

```jldoctest
julia> round(1.7)
2.0

julia> round(Int, 1.7)
2

julia> round(1.5)
2.0

julia> round(2.5)
2.0

julia> round(pi; digits=2)
3.14

julia> round(pi; digits=3, base=2)
3.125

julia> round(123.456; sigdigits=2)
120.0

julia> round(357.913; sigdigits=4, base=2)
352.0
```
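
The tie-breaking behaviour and the `InexactError` case described above can be checked side by side. This is a minimal sketch using only `Base`; the variable names are arbitrary.

```julia
halves = [0.5, 1.5, 2.5, 3.5]

# Default RoundNearest: exact halves go to the nearest even integer.
to_even = [round(h) for h in halves]

# RoundNearestTiesAway: exact halves go away from zero instead.
ties_away = [round(h, RoundNearestTiesAway) for h in halves]

# Directed modes ignore ties and always move one way.
up        = round(2.1, RoundUp)        # same as ceil(2.1)
down      = round(2.9, RoundDown)      # same as floor(2.9)
to_zero   = round(-2.9, RoundToZero)   # same as trunc(-2.9)

# A negative `digits` rounds to the left of the decimal point.
hundreds = round(1234.5678; digits=-2)

# Requesting a type that cannot hold the result throws InexactError.
threw_inexact = try
    round(Int8, 300.0)
    false
catch err
    err isa InexactError
end
```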

!!! note
    Rounding to specified digits in bases other than 2 can be inexact when operating on binary floating point numbers. For example, the [`Float64`](@ref) value represented by `1.15` is actually *less* than 1.15, yet will be rounded to 1.2.

    # Examples

    ```jldoctest; setup = :(using Printf)
    julia> x = 1.15
    1.15

    julia> @sprintf "%.20f" x
    "1.14999999999999991118"

    julia> x < 115//100
    true

    julia> round(x, digits=1)
    1.2
    ```
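
One way to see why the stored value rounds up is to compare exact rational arithmetic with `Float64` arithmetic. The sketch below is an illustration of the effect, not a description of how `Base.round` is implemented internally: multiplying the stored value by 10 in `Float64` lands on the tie `11.5`, whereas the exact product is slightly below it.

```julia
x = 1.15

# Exact rational value of the Float64 that the literal 1.15 is stored as.
exact = Rational{BigInt}(x)
is_below = exact < 115//100          # the stored value is below 1.15

# Scaling in Float64 rounds the product to the nearest representable number.
scaled = x * 10
float_result = round(Int, scaled)    # rounding the Float64 product

# Scaling exactly keeps the product just under 11.5.
exact_result = round(Int, exact * 10)
```

Here `float_result` is 12 and `exact_result` is 11, which matches the documented outcome `round(x, digits=1) == 1.2` even though `x < 115//100`.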


# Extensions

To extend `round` to new numeric types, it is typically sufficient to define `Base.round(x::NewType, r::RoundingMode)`.
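
A minimal sketch of such an extension follows. The `Celsius` type is hypothetical and exists only for this example; the single method defined is the two-argument form named above, and the sketch calls only that form directly.

```julia
# Hypothetical numeric wrapper type, for illustration only.
struct Celsius <: Real
    value::Float64
end

# The one method the text above suggests defining: round the wrapped
# value with the requested mode and re-wrap the result.
Base.round(x::Celsius, r::RoundingMode) = Celsius(round(x.value, r))

nearest = round(Celsius(21.5), RoundNearest)   # ties go to even
upward  = round(Celsius(21.1), RoundUp)
```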

---

```
round(dt::TimeType, p::Period, [r::RoundingMode]) -> TimeType
```

Return the `Date` or `DateTime` nearest to `dt` at resolution `p`. By default (`RoundNearestTiesUp`), ties (e.g., rounding 9:30 to the nearest hour) will be rounded up.

For convenience, `p` may be a type instead of a value: `round(dt, Dates.Hour)` is a shortcut for `round(dt, Dates.Hour(1))`.

```jldoctest
julia> round(Date(1985, 8, 16), Dates.Month)
1985-08-01

julia> round(DateTime(2013, 2, 13, 0, 31, 20), Dates.Minute(15))
2013-02-13T00:30:00

julia> round(DateTime(2016, 8, 6, 12, 0, 0), Dates.Day)
2016-08-07T00:00:00
```

Valid rounding modes for `round(::TimeType, ::Period, ::RoundingMode)` are `RoundNearestTiesUp` (default), `RoundDown` (`floor`), and `RoundUp` (`ceil`).
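
The three valid modes can be compared on a single tie value. This sketch uses the 9:30 example mentioned above; the particular date is arbitrary.

```julia
using Dates

dt = DateTime(2013, 2, 13, 9, 30)

# Default RoundNearestTiesUp: the 9:30 tie goes up to 10:00.
nearest = round(dt, Hour)

# RoundDown and RoundUp correspond to floor and ceil.
down = round(dt, Hour, RoundDown)
up   = round(dt, Hour, RoundUp)

# A type such as `Hour` is shorthand for the value `Hour(1)`.
same = round(dt, Hour) == round(dt, Hour(1))
```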

---

```
round(x::Period, precision::T, [r::RoundingMode]) where T <: Union{TimePeriod, Week, Day} -> T
```

Round `x` to the nearest multiple of `precision`. If `x` and `precision` are different subtypes of `Period`, the return value will have the same type as `precision`. By default (`RoundNearestTiesUp`), ties (e.g., rounding 90 minutes to the nearest hour) will be rounded up.

For convenience, `precision` may be a type instead of a value: `round(x, Dates.Hour)` is a shortcut for `round(x, Dates.Hour(1))`.

```jldoctest
julia> round(Dates.Day(16), Dates.Week)
2 weeks

julia> round(Dates.Minute(44), Dates.Minute(15))
45 minutes

julia> round(Dates.Hour(36), Dates.Day)
2 days
```

Valid rounding modes for `round(::Period, ::T, ::RoundingMode)` are `RoundNearestTiesUp` (default), `RoundDown` (`floor`), and `RoundUp` (`ceil`).

Rounding to a `precision` of `Month`s or `Year`s is not supported, as these `Period`s are of inconsistent length.
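
The same pattern applies to periods. The sketch below checks the 90-minute tie under each mode and confirms that a `Month` precision is rejected; it catches any exception rather than assuming a specific error type.

```julia
using Dates

# 90 minutes is a tie between 1 and 2 hours; the default rounds it up.
nearest = round(Minute(90), Hour)
down    = round(Minute(90), Hour, RoundDown)
up      = round(Minute(61), Hour, RoundUp)

# The result takes the type of `precision`, not of the input.
result_type = typeof(round(Hour(36), Day))

# Months have inconsistent length, so this is expected to fail.
month_rejected = try
    round(Day(45), Month)
    false
catch
    true
end
```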
