Add task interface #68

ablaom · 2019-02-05T03:36:57Z

No description provided.

ablaom · 2019-02-05T03:56:03Z

Naive plan: to imitate MLR and/or OpenML.

@fkiraly has raised some issues:

"The mlr design, in my opinion, has some flaws.
A key pain point for me is the treatment of the data: where does it go? Is it part of the task (e.g., pointed to), or not? Is the task applied to the data, or is the model applied to the task? And so forth."

We are meeting soon to discuss in conjunction with another project. Watch this space for further discussion. Suggestions welcome.

ablaom · 2019-03-04T23:23:00Z

@kirtsar

What is a Task? Here's the present design for the supervised tasks:

struct SupervisedTask{U} <: MLJTask  # U is true for a single target
    data                         # a table
    targets                      # list of names
    ignore::Vector{Symbol}       # list of names
    is_probabilistic
    target_scitype               # inferred from the data
    input_scitypes               # inferred from the data
    input_is_multivariate::Bool  # inferred from the data
end

In discussions at Turing there was consensus that tasks exclude any description of evaluation (a point of departure from OpenML), although this is not cast in stone.

So, whether a task is regression or classification is part of the task description, namely in target_scitype (which is actually a little more informative).

At present the Task constructor assumes the data meets the spec outlined at doc/getting_started.md and infers the last three fields from the data. However, my idea is to eventually make the constructor more flexible, coercing data if necessary based on user interaction. The user could also let the constructor make educated guesses about the intended scientific type, and so forth.
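A minimal sketch of how such inference might look, assuming a column table represented as a NamedTuple of vectors. The scitype markers, `guess_scitype`, and the keyword constructor are hypothetical stand-ins for illustration, not the package's actual definitions:

```julia
# Hypothetical stand-ins for scientific types (not the package's own).
abstract type Scitype end
struct Continuous <: Scitype end
struct Count <: Scitype end
struct Unknown <: Scitype end

# Assumed convention: floats are Continuous, integers are Count.
guess_scitype(::AbstractVector{<:AbstractFloat}) = Continuous
guess_scitype(::AbstractVector{<:Integer}) = Count
guess_scitype(::AbstractVector) = Unknown

abstract type MLJTask end

struct SupervisedTask{U} <: MLJTask  # U is true for a single target
    data                             # a column table (NamedTuple of vectors)
    targets::Vector{Symbol}
    ignore::Vector{Symbol}
    is_probabilistic::Bool
    target_scitype
    input_scitypes
    input_is_multivariate::Bool
end

# Hypothetical constructor: infers the last three fields from the data.
function SupervisedTask(data::NamedTuple; targets, ignore = Symbol[],
                        is_probabilistic = false)
    targets = targets isa Symbol ? [targets] : collect(targets)
    inputs = [n for n in keys(data) if !(n in targets) && !(n in ignore)]
    target_scitype = length(targets) == 1 ?
        guess_scitype(data[targets[1]]) :
        Tuple(guess_scitype(data[t]) for t in targets)
    input_scitypes = Dict(n => guess_scitype(data[n]) for n in inputs)
    return SupervisedTask{length(targets) == 1}(
        data, targets, ignore, is_probabilistic,
        target_scitype, input_scitypes, length(inputs) > 1)
end

# Usage: a two-input regression-style task with target column :y.
data = (x1 = [1.0, 2.0], x2 = [3, 4], y = [0.5, 1.5])
task = SupervisedTask(data; targets = :y)
```

Here the type parameter `U` is set from the number of target names, so single-target and multi-target tasks can be dispatched on separately.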

The user might give the task constructor a kwarg target=MultiClass (i.e., classifier), and, supposing the target type is Int, the target column is then coerced to have CategoricalValue eltype. If no kwarg is given, then the constructor infers the scientific type from the data (in this case Count) and reports that it has done so.
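The behaviour described above could be sketched roughly as follows. Everything here is an assumption made for illustration: `Category` is a simple stand-in for a categorical element type (it is not CategoricalValue), and `coerce_target` and `prepare_target` are hypothetical helper names:

```julia
# Hypothetical scitype markers (stand-ins, not the package's own types).
struct Count end
struct MultiClass end

# Stand-in for a categorical element: a value plus the pool of levels.
struct Category{T}
    value::T
    levels::Vector{T}
end

# The user asked for MultiClass: coerce the integer column to categories.
function coerce_target(y::AbstractVector{<:Integer}, ::Type{MultiClass})
    levels = sort(unique(y))
    return [Category(v, levels) for v in y]
end

# No request given: keep the data as is and report the inferred scitype.
function coerce_target(y::AbstractVector{<:Integer}, ::Nothing)
    @info "No target scitype given; inferring Count from the integer eltype."
    return y
end

# Returns the (possibly coerced) target column and its scientific type.
function prepare_target(y; target = nothing)
    scitype = target === nothing ? Count : target
    return coerce_target(y, target), scitype
end

# Usage: an integer target treated as class labels.
y = [2, 1, 2, 3]
yc, scitype = prepare_target(y; target = MultiClass)
```

Dispatching on the requested scitype keeps the coercion rules in one place, so further (source eltype, requested scitype) pairs could be added as extra methods.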

The present design does suppose that, once the task is constructed, the data it wraps conforms to our standard. This aspect I would be reluctant to change at this point.

ablaom · 2019-03-04T23:23:39Z

Oops. Closed by accident.

tlienart · 2019-03-05T00:39:08Z

"The present design does suppose that, once the task is constructed, the data it wraps conforms to our standard. This aspect I would be reluctant to change at this point."

It seems to me this is not too restrictive, given that there can always be a "pre" step where the data is verified and/or coerced, right?

fkiraly · 2019-03-05T15:01:05Z

Pasting @kirtsar's comment from the merged issue #96:

"What should the Task do?
My vision of working with a Task object is something like this:
assume that we have some data for supervised learning: X, y. X and y can be of any reasonable type (X is a Matrix, DataFrame, ...; y is some subtype of AbstractVector).

task = Task(data = X, target = y, goal = SomeGoal(optional args) )

where SomeGoal is something from (for example):

Binary(is proba = true/false)
Multiclass(is proba = true/false)
Regression(is proba = true/false)
Based on the type of the task, the output for X_and_y should be appropriate (Continuous, discrete, ...)

"

fkiraly · 2019-03-05T15:03:32Z

@kirtsar I think the current design satisfies your requirements?
Except that X and y are not split explicitly; only a column reference indicates which column is y.
In my opinion this makes a lot of sense, since there are other types of tasks where the specification is not easily done by splitting the data in two.
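To illustrate the column-reference approach, here is a minimal sketch of recovering X and y on demand from a single table. It assumes the table is a NamedTuple of column vectors; the helper name and signature are hypothetical and chosen only for this example:

```julia
# Hypothetical helper: split a column table into inputs X and target(s) y,
# using only the target names and the ignore list.
function X_and_y(data::NamedTuple, targets::Vector{Symbol};
                 ignore::Vector{Symbol} = Symbol[])
    input_names = Tuple(n for n in keys(data)
                        if !(n in targets) && !(n in ignore))
    X = NamedTuple{input_names}(data)       # select the input columns
    y = length(targets) == 1 ?
        data[targets[1]] :                  # single target: a plain vector
        NamedTuple{Tuple(targets)}(data)    # several targets: a sub-table
    return X, y
end

# Usage: :id is ignored, :y is the target, the rest are inputs.
data = (x1 = [1.0, 2.0], id = [10, 11], y = [0, 1])
X, y = X_and_y(data, [:y]; ignore = [:id])
```

Because the split is derived from names, the same table could back tasks whose specification is not a simple two-way split.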

ablaom · 2019-06-05T21:10:22Z

A basic task interface is now in place. Let's open new issues for possible enhancements.

ablaom added the enhancement and design discussion labels Feb 5, 2019

ablaom mentioned this issue Feb 5, 2019

Add benchmarking tools #69

Open

ablaom mentioned this issue Mar 4, 2019

Task Design? #96

Closed

ablaom closed this as completed Mar 4, 2019

ablaom reopened this Mar 4, 2019

ablaom mentioned this issue Mar 27, 2019

Add type coercion to task constructors #109

Closed

ablaom closed this as completed Jun 5, 2019
