Implement the MLJ model API without needing to depend on external dependencies such as CSV.jl, CategoricalArrays.jl, etc. #19

@DilumAluthge

As far as I can tell, in order for me to implement the MLJ model API, I need to import MLJBase.jl.

While MLJBase.jl is more lightweight than MLJ.jl, it still has quite a few dependencies of its own. I would rather not have to depend on CSV.jl, CategoricalArrays.jl, Tables.jl, etc. just to implement the MLJ model API.

Take this modified version of the simple deterministic regressor example:

import MLJBase
using LinearAlgebra

mutable struct MyRegressor <: MLJBase.Deterministic
    lambda::Float64
end
MyRegressor(; lambda=0.1) = MyRegressor(lambda)

# fit returns coefficients minimizing a penalized rms loss function:
function MLJBase.fit(model::MyRegressor, X, y)
    fitresult = (X'X + model.lambda*I)\(X'y)  # the coefficients (ridge normal equations)
    return fitresult
end

# predict uses coefficients to make new prediction:
MLJBase.predict(model::MyRegressor, fitresult, Xnew) = Xnew*fitresult

I can define this entire model without any external dependencies; the only thing it needs beyond MLJBase.jl is the LinearAlgebra standard library. Unfortunately, because I need to import MLJBase.jl, I still end up depending on all of MLJBase.jl's dependencies.

The JuliaData people have solved this problem by creating the DataAPI.jl package. DataAPI.jl is a tiny package that has no dependencies and provides the namespace for the JuliaData API.

Would you be open to creating a similar MLJapi.jl package? The package would be very simple. It would have no dependencies, and its only content would consist of type definitions and function stubs, for example:

abstract type MLJType end
abstract type Model <: MLJType end
abstract type Supervised <: Model end 
...

function fit end
function update end
function predict end
...

MLJ.jl, MLJBase.jl, MLJModels.jl, etc. would import MLJapi.jl and extend its functions.
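
To illustrate what "import and extend" could look like on the framework side, here is a minimal sketch. Everything in it is hypothetical: `MLJapi` is an inline stand-in for the proposed package, `Framework` stands in for a package like MLJBase.jl, and `fit_predict` and `MeanModel` are made-up names used only for this example.

```julia
# Hypothetical stand-in for the proposed MLJapi.jl: no dependencies,
# only abstract types and method-less generic functions.
module MLJapi
abstract type MLJType end
abstract type Model <: MLJType end
abstract type Supervised <: Model end
function fit end       # declares a generic function with zero methods
function predict end
end # module

# Hypothetical stand-in for a framework package such as MLJBase.jl.
module Framework
import ..MLJapi
# A helper written purely against the abstract API (illustrative name).
fit_predict(model::MLJapi.Supervised, X, y, Xnew) =
    MLJapi.predict(model, MLJapi.fit(model, X, y), Xnew)
end # module

# The stub has no methods until some package adds one.
n_methods_before = length(methods(MLJapi.fit))

# A toy model that predicts the mean of the training targets.
struct MeanModel <: MLJapi.Supervised end
MLJapi.fit(::MeanModel, X, y) = sum(y) / length(y)
MLJapi.predict(::MeanModel, fitresult, Xnew) = fill(fitresult, size(Xnew, 1))

yhat = Framework.fit_predict(MeanModel(), zeros(3, 1), [1.0, 2.0, 3.0], zeros(2, 1))
```

Because the generic functions are owned by the tiny package, the framework and the model never need to know about each other; dispatch connects them at call time.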

This would allow other package authors to implement the MLJ model API without needing to depend on all of MLJBase's dependencies.
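
From a model author's point of view, the result might look like the sketch below. The `MLJapi` module here is again a hypothetical inline stand-in for the proposed package (with a `Deterministic` type assumed to live there), `MyModels` is an illustrative package name, and the regressor is the one from above written in the standard ridge form.

```julia
# Hypothetical stand-in for the proposed MLJapi.jl.
module MLJapi
abstract type MLJType end
abstract type Model <: MLJType end
abstract type Supervised <: Model end
abstract type Deterministic <: Supervised end   # assumed to be part of the API package
function fit end
function update end
function predict end
end # module

# A third-party package: depends only on MLJapi and a standard library.
module MyModels
import ..MLJapi
using LinearAlgebra

mutable struct MyRegressor <: MLJapi.Deterministic
    lambda::Float64
end
MyRegressor(; lambda=0.1) = MyRegressor(lambda)

# Add methods to the stubs owned by MLJapi.
MLJapi.fit(model::MyRegressor, X, y) = (X'X + model.lambda*I) \ (X'y)
MLJapi.predict(model::MyRegressor, fitresult, Xnew) = Xnew * fitresult
end # module

# Small check: with lambda = 0 this reduces to ordinary least squares.
X = [1.0 0.0; 0.0 1.0; 1.0 1.0]
y = [1.0, 2.0, 3.0]
model = MyModels.MyRegressor(lambda=0.0)
coefs = MLJapi.fit(model, X, y)
yhat = MLJapi.predict(model, coefs, X)
```

In a real package, the `import ..MLJapi` line would simply be `import MLJapi`, and that would be the only non-stdlib entry in the package's dependencies.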

Would you be willing to adopt this approach? If so, I'd be more than happy to help create the MLJapi.jl package.
