Storing intermediate results of a Composite Model #841

Open
olivierlabayle opened this issue Sep 16, 2021 · 16 comments

Labels: design discussion (Discussing design issues)
@olivierlabayle

Hi!

Is your feature request related to a problem? Please describe.
I am trying to use the learning network API and would like to store additional results in the fitresult of my composite model. Could you provide some guidance on how to do this properly?

Describe the solution you'd like
Ideally I'd like to be able to store the value of any node that was computed at training time.

Describe alternatives you've considered
It seems that only the submodels' fitresults are natively stored, so one way to do it, I guess, would be to define some kind of ResultModel as a submodel for whatever value I would like to keep, and to compute the result in the fit! function of that model.
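
For concreteness, a minimal sketch of what I mean (ResultModel and its field f are hypothetical names, not existing MLJ API; the point is just that an ordinary submodel can capture a value at training time):

import MLJModelInterface
const MMI = MLJModelInterface

# Hypothetical wrapper: evaluates a function of its training input at fit
# time and stores the result, so it survives in the composite's fitresult.
mutable struct ResultModel <: MMI.Unsupervised
    f   # the function whose value on the training data we want to keep
end

function MMI.fit(model::ResultModel, verbosity, X)
    value = model.f(X)        # computed once, at training time
    fitresult = value         # retrievable later via fitted_params
    cache = nothing
    report = (value=value,)   # also exposed through the machine's report
    return fitresult, cache, report
end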

Additional context

I must add that the learning network I am trying to build is not regular, in that it will never be used for prediction; however, I feel that what I'm trying to do may be of general use in MLJ.

For instance, the following works fine, except that I can't retrieve the value of the final node because of the anonymization in return!. Moreover, I don't think this approach is appropriate anyway, as I guess all computations (except fitting) would be performed again each time I call the node, right?

using MLJ
using Statistics   # for `mean`

LinearRegressor = @load LinearRegressor pkg=MLJLinearModels verbosity=0


mutable struct MyModel <: MLJ.DeterministicComposite
    model
end


function MLJ.fit(m::MyModel, verbosity, X, y)
    Xs = source(X)
    ys = source(y)
    mach = machine(m.model, Xs, ys)
    ypred = MLJ.predict(mach, Xs)
    # summary statistics of the training predictions
    μpred = node(mean, ypred)
    σpred = node((x, μ) -> mean((x .- μ).^2), ypred, μpred)
    mach = machine(Deterministic(), Xs, ys; predict=σpred)
    fitresult, cache, report = return!(mach, m, verbosity)
    # hack: graft the σpred node onto the fitresult so it can be called later
    mach.fitresult = (σpred=σpred, fitresult...)
    return mach.fitresult, cache, report
end

X, y = make_regression(500, 5)
mach = machine(MyModel(LinearRegressor()), X, y)
fit!(mach)
fitted_params(mach)
mach.fitresult.σpred()
@ablaom
Member

ablaom commented Sep 17, 2021

@olivierlabayle Thanks for raising this interesting question about creating new interface points for composite models.

Of course, if you are not interested in the ordinary predict output, you could just define the predict node to be σpred, with return!(machine(Deterministic(), Xs, ys; predict=σpred), model, verbosity). But I don't think that is what you are getting at, right? You are looking to add ways of accessing information beyond what can be extracted from predict and transform (you can already define both, incidentally).

But I am interested in clarifying exactly what you want here. I see two possible objectives. Do you want the output of σpred on the training data to be recorded somehow in the report or fitted params, or are you effectively seeking to add a new operation that can be called on new data, like a predict operation? That is, are we trying to record extra data as a by-product of training, or do we want to add extra functions that dispatch on both new data and the outcomes of training?

@olivierlabayle
Author

olivierlabayle commented Sep 17, 2021

@ablaom Thanks for getting back to me so quickly.

I am working in causality, which means that my scenario differs from the traditional MLJ framework in the following ways:

  • I don't have data as (X, y) but rather (X, W, y).
  • I don't really have a predict time. In MLJ the learning algorithm outputs a prediction function, while I am only interested in outputting a real number (or vector).

The reason I am so interested in the learning network API is that I think it provides a nice caching and scheduling mechanism. For instance, again in my use case, I might want to change one hyperparameter of model3 (see below), and the whole procedure will then not refit model1 and model2, because their upstream has not changed; a sketch of this follows the code below.

To cut it short, I think using the predict node (or, more reasonably, defining a new operation node) might work for me (as in the following), but I don't want the computations to happen twice. Moreover, this currently doesn't work, because predict expects the data to be (X, y). The other solution would be to record some state information at fit time, as you mention; that seems both more appropriate for my use case and still useful for general MLJ users (for instance, I initially wanted to report the scores of the learners in the Stack). For general MLJ users it would be in addition to the predict function, and for me it would be all I require.

Hope this helps!

using MLJ
using Statistics   # for `mean`

LinearRegressor = @load LinearRegressor pkg=MLJLinearModels verbosity=0


mutable struct MyModel <: MLJ.DeterministicComposite
    model1
    model2
    model3
end


function MLJ.fit(m::MyModel, verbosity, X, W, y)
    Xs = source(X)
    Ws = source(W)
    ys = source(y)

    mach1 = machine(m.model1, Xs, ys)
    mach2 = machine(m.model2, Ws, ys)

    ypred1 = MLJ.predict(mach1, Xs)
    ypred2 = MLJ.predict(mach2, Ws)

    Y = hcat(ypred1, ypred2)

    mach3 = machine(m.model3, Y, ys)

    ypred3 = MLJ.predict(mach3, Y)

    # summary statistics of the predictions on the training data
    μpred = node(mean, ypred3)
    σpred = node((x, μ) -> mean((x .- μ).^2), ypred3, μpred)

    # the final estimate: a (mean, variance) pair
    estimate = node((μ, σ2) -> (μ, σ2), μpred, σpred)

    mach = machine(Deterministic(), Xs, ys; predict=estimate)

    return!(mach, m, verbosity)

end

X, y = make_regression(500, 5)
model = MyModel(LinearRegressor(), LinearRegressor(), LinearRegressor())
mach = machine(model, X, X, y)
fit!(mach)
estimate = MLJ.predict(mach)
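
And here is the caching behaviour I mean, assuming fit_intercept is a hyperparameter of model3 (it is, for MLJLinearModels' LinearRegressor); the retraining behaviour described in the comments is my understanding of how exported learning networks work:

model.model3.fit_intercept = false  # change a hyperparameter of model3 only
fit!(mach)                          # only the machine for model3 is retrained;
                                    # model1 and model2 are untouched, since
                                    # their models and upstream data are unchanged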

@ablaom
Member

ablaom commented Sep 23, 2021

@olivierlabayle I've played around with this a bit today and will ask for your feedback on one experiment in the next day or so.

@ablaom ablaom added the design discussion Discussing design issues label Sep 23, 2021
@olivierlabayle
Author

@ablaom That's great, very happy to hear that, thanks a lot!

@ablaom
Member

ablaom commented Sep 24, 2021

@olivierlabayle Please have a look at JuliaAI/MLJBase.jl#644 which addresses the original suggestion and give me your feedback.

I think in the immediate term causal inference with targeted learning is out-of-scope. My focus for the next few months will be moving towards version 1.0.

Perhaps you can hack around the other obstacles for now, e.g. by exporting a predict node that you have no intention of using.

You might also want to conceptualise your model as a transformer with a single tuple (X, W, y) as input, which you split up.
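
A rough sketch of that transformer idea (hedged: TupleSplitter is a made-up name, the placeholder output node stands in for your real network, and I'm assuming the Unsupervised surrogate and transform behave as for the other surrogates):

using MLJ

# Hypothetical composite taking the single tuple (X, W, y) as input:
mutable struct TupleSplitter <: MLJ.UnsupervisedComposite end

function MLJ.fit(m::TupleSplitter, verbosity, data)
    ds = source(data)             # data is the tuple (X, W, y)
    Xs = node(d -> d[1], ds)      # split the tuple with ordinary nodes
    Ws = node(d -> d[2], ds)
    ys = node(d -> d[3], ds)
    # ... build the rest of the network on Xs, Ws and ys as before ...
    out = node((x, w, y) -> (x, w, y), Xs, Ws, ys)  # placeholder output
    mach = machine(Unsupervised(), ds; transform=out)
    return!(mach, m, verbosity)
end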

@olivierlabayle
Author

Yes, I understand, and I wasn't planning on having a dedicated MLJ structure for this. As you say, I will be hacking a bit; for now it's a model with an unused predict node, but I like the transformer idea. I think with this pull request I should be good to go and can benefit from the learning network machinery.

@ablaom ablaom self-assigned this Sep 28, 2021
@davnn
Collaborator

davnn commented Oct 5, 2021

Wouldn't it be more intuitive/self-explanatory to add a report kwarg to the surrogate machine call that takes a named tuple as input? It would also allow for fitted_params, if that becomes necessary at some point in the future.

mach = machine(Deterministic(), Xs, ys; predict=ypred3, μpred=μpred, σpred=σpred)

would become

mach = machine(Deterministic(), Xs, ys; predict=ypred3, report=(μpred=μpred, σpred=σpred))

@ablaom I also stumbled over this issue while implementing composite detectors, which should store training scores in the report for the composite model.

@ablaom
Member

ablaom commented Oct 5, 2021

@davnn Thanks for chiming in here.

I also thought of this, but it seemed a bit more complicated. But yes, as you say, it may be "more intuitive/self-explanatory". I should be happy to make that change.

which should store training scores in the report for the composite model.

Ah, yes, I can imagine that could be so. Does this mean we need to expedite this somewhat? Currently this is low on my priorities as I am swamped with other stuff.

@davnn
Collaborator

davnn commented Oct 6, 2021

Ah, yes, I can imagine that could be so. Does this mean we need to expedite this somewhat? Currently this is low on my priorities as I am swamped with other stuff.

Nope, consider it low priority as well; I'm just using a custom return! for now.

@olivierlabayle
Author

@ablaom Thank you for managing to implement this feature!

@davnn
Collaborator

davnn commented Aug 18, 2022

I'm having a difficult time converting my custom return! to the new MLJ API (added in JuliaAI/MLJBase.jl#644). Previously, I could just use

function return_with_scores!(network_mach, model, verbosity, scores_train, X)
    fitresult, cache, report = MLJ.return!(network_mach, model, verbosity)
    # append the training scores to the composite model's report
    report = merge(report, (scores=scores_train(X),))
    return fitresult, cache, report
end

instead of return!, to add a scores field to the report named tuple. Using the same function with the new MLJ API results in report = (..., additions = (scores = [1, 2, 3], ...)), which means there is no longer a unified API (between composite and individual models) for accessing the training scores. I would now have to check everywhere whether the model is a composite and, if so, use report.additions.scores. Or is there a better solution?
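
One possible workaround is a small helper that hides the asymmetry (a sketch only; training_scores is a hypothetical name, not MLJ API, and it assumes the scores sit either at the top level of the report or under additions):

function training_scores(mach)
    rep = report(mach)
    # individual models store scores at the top level; composites nest
    # fit-time extras under `additions`
    return haskey(rep, :scores) ? rep.scores : rep.additions.scores
end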

@ablaom
Member

ablaom commented Aug 19, 2022

@davnn Good point. I suggest we add a raw_training_scores accessor function as suggested in the tracking issue cross-referenced above.

What do you think?

@davnn
Collaborator

davnn commented Aug 21, 2022

Thank you for your detailed thoughts on how we could go forward. I need some more time to think about it. I'm a bit afraid of feature creep in MLJ, but maybe that's not a big problem.

@ablaom
Member

ablaom commented Aug 23, 2022

Alternatively, we could introduce more generic accessor functions, training_predictions(model, fitresult, report) and training_transformations(model, fitresult, report), which, when implemented, are syntactically equivalent to predict(model, fitresult, Xtrain) and transform(model, fitresult, Xtrain) but more efficient, because they just extract data pre-computed at fit time (and available in fitresult or report). Mmm, might that be a bit abstract for users?

In your use case, you would overload training_transformations to return the raw training scores for all detectors: for regular detectors, this is report.scores (or whatever - I forget what you call them), and for composite models it's report.additions.scores.
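
Sketched out, that might look like this (all type names here are hypothetical stand-ins, and the accessor itself is only a proposal at this stage, not existing MLJ API):

struct MyDetector end            # stand-in for a regular detector type
struct MyCompositeDetector end   # stand-in for a composite detector type

# scores recorded directly in the report at fit time:
training_transformations(model::MyDetector, fitresult, report) =
    report.scores

# composite models nest fit-time extras under `additions`:
training_transformations(model::MyCompositeDetector, fitresult, report) =
    report.additions.scores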

@davnn
Collaborator

davnn commented Aug 23, 2022

I would prefer to keep the API simple, with a report that can flexibly accommodate predictions, transformations, or whatever else the algorithm produces. Strangely enough, predict(model, fitresult, Xtrain) would NOT reproduce the training scores observed during fit for neighbor-based methods, because predict would compare the points in Xtrain to Xtrain, whereas fit ignores the first (trivial) neighbor.

It might make sense to follow the uniform access principle for things like a model's report, i.e. to discourage or even disallow direct access to model internals such as model.report, and to encourage report(model), which could easily be customized on a per-model basis to return any custom format.

@ablaom
Member

ablaom commented Aug 29, 2022

Thanks for these points. I have some ideas about how to do this properly (and also how to greatly simplify the learning networks "export" process) but it's going to take a little time. I will keep you posted, and I appreciate your patience.
