
Add interface points for accessing the internal state of an exported learning network composite model #644

Merged

Conversation

@ablaom (Member) commented Sep 24, 2021

This PR is an attempt to address some issues raised in JuliaAI/MLJ.jl#841.

Suppose you have a learning network that you would like to export as a new stand-alone model type, with composite an instance of the new type. Having bound composite to some data in a machine mach, you would like to arrange that report(mach) will record some additional information about the internal state of the learning network that was built internally when you called fit!(mach).

Specifically, this PR allows the following: given any node N in the network, specified in the export process, you can arrange for the result of N() to be recorded in report(mach). Naturally, the call N() happens immediately after the network is fit, and before the internal data anonymization step that empties the source nodes.

Since N is called with no arguments, it will never see "production" data, which is a point of difference with the predict and/or transform nodes declared at export, which are always called on the production data Xnew, as in predict(mach, Xnew). On the other hand, this also means N is allowed to have multiple origin nodes (query origins for details; see also the sketch following the demo below). This is indeed the case in the following example, which records a training error in the composite model report:

(edited to reflect syntax adopted after discussions below)

using MLJ

import MLJModelInterface

struct MyModel <: ProbabilisticComposite
    model
end

function MLJModelInterface.fit(composite::MyModel, verbosity, X, y)

    Xs = source(X)
    ys = source(y)

    mach = machine(composite.model, Xs, ys)
    yhat = predict(mach, Xs)
    e = @node auc(yhat, ys)   # <------  node whose state we wish to export

    network_mach = machine(Probabilistic(),
                           Xs,
                           ys,
                           predict=yhat,
                           report=(training_error=e,))  # <------ how we export additional node(s)

    return!(network_mach, composite, verbosity)
end

# demo

X, y = make_moons()
composite = MyModel(ConstantClassifier())
mach = machine(composite, X, y) |> fit!
err = report(mach).training_error    # <------ accessing the node state

yhat = predict(mach, rows=:);
@assert err ≈ auc(yhat, y)
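
For instance, building the error node at top level out of the same ingredients, one can verify that it has two origins, which disqualifies it as a predict node but not as a report node (a sketch; the names mach0, yhat0, e0 are introduced here for illustration):

Xs = source(X)
ys = source(y)
mach0 = machine(ConstantClassifier(), Xs, ys)
yhat0 = predict(mach0, Xs)
e0 = @node auc(yhat0, ys)

origins(e0)   # [Xs, ys]: two origins, whereas a predict node must have exactly one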

This is a preliminary proof of concept and criticism is most welcome.

The PR also needs a bit more unit testing.

@codecov-commenter commented Sep 24, 2021

Codecov Report

Merging #644 (0ec5472) into for-0-point-19-release (3676ea3) will increase coverage by 0.20%.
The diff coverage is 92.42%.


@@                    Coverage Diff                     @@
##           for-0-point-19-release     #644      +/-   ##
==========================================================
+ Coverage                   86.60%   86.81%   +0.20%     
==========================================================
  Files                          37       37              
  Lines                        3352     3389      +37     
==========================================================
+ Hits                         2903     2942      +39     
+ Misses                        449      447       -2     
Impacted Files                                   Coverage Δ
src/MLJBase.jl                                   92.85% <ø> (ø)
src/composition/learning_networks/nodes.jl       69.17% <ø> (+1.36%) ⬆️
src/machines.jl                                  84.02% <ø> (ø)
src/composition/learning_networks/machines.jl    90.36% <91.52%> (+2.66%) ⬆️
src/composition/models/inspection.jl             100.00% <100.00%> (ø)
src/composition/models/methods.jl                100.00% <100.00%> (ø)
src/operations.jl                                80.76% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@olivierlabayle (Collaborator) commented:

The overall high-level functionality seems to match what I had in mind and looks great from my perspective. I've added a high-level test in that direction, but it may be of little value; feel free to revert the commit if so.

I'm afraid I can't provide much feedback on the details of the implementation, as I don't fully understand how things are chained together under the hood. I've tried to dig a bit into the learning network implementation, but with the asynchronicity I find it difficult to understand the internals. In particular, I am wondering whether the results of a node call are cached (if the node is called again, for instance)?

@ablaom (Member, Author) commented Sep 25, 2021

After reflection, I think the extra information should go into the report rather than the fitted_params. Strictly speaking, fitted_params is just for the "minimum" required to dispatch predict and/or transform. See this guideline.

@olivierlabayle Any objections to this change? I understand that in your use-case fitted_params might feel more appropriate, but your use-case really sits outside the current intentions of the API, no?

Another possibility is to allow writing to both report and fitted_params, but I don't see how to design this in a way that's not more complicated, and I'm not sure the extra complication is warranted. But perhaps you have a suggestion?
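
For concreteness, a minimal sketch of how the two accessors then separate, re-using mach and the training_error key from the example above:

report(mach).training_error   # auxiliary byproducts of training are exposed here
fitted_params(mach)           # remains just the learned parameters needed by predict/transform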

@ablaom (Member, Author) commented Sep 26, 2021

@olivierlabayle By the way, thanks for the extra test.

In particular, I am wondering whether the results of a node call are cached (if the node is called again, for instance)?

The answer to the question I think you are asking is "yes". The call to the node is made immediately after fit! and the result is recorded. When you inspect the fitresult, you are inspecting this stored return value, not re-calling the node (which wouldn't work anyway, because of data anonymization).

Generally, machines in a learning network, and elsewhere, cache data, unless they are constructed with machine(...; cache=false). If caching is turned off, evaluating a node is purely lazy.

A machine bound to a composite model (a subtype of Composite) does not cache data by default, although, as just explained, the machines in learning networks constructed under the hood generally do.

Clear as mud, right?
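
A minimal sketch of the defaults just described, assuming only the cache keyword of the machine constructor:

using MLJ
X, y = make_moons()
model = ConstantClassifier()
mach1 = machine(model, X, y)               # cache=true is the default: training data is cached internally
mach2 = machine(model, X, y; cache=false)  # caching turned off: no internal copy of the data is kept
fit!(mach1); fit!(mach2)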

@olivierlabayle (Collaborator) commented:

No problem at all exporting the results to the report instead of the fitted_params; as you say, fitted_params should hold the parameter values of the learnt function.

@olivierlabayle (Collaborator) commented:

Haha, it's indeed hard to catch. I was actually wondering in a general manner, as in the second case you describe.

From what I "imagine", a fit! of the composite model will necessarily call each node in the computational graph, to trigger the different fits on the appropriate data. Some nodes are not bound to a machine (static nodes), so they cannot be cached, right? Does that mean this kind of node might be evaluated (computed) multiple times if called multiple times?

@ablaom (Member, Author) commented Sep 26, 2021

From what I "imagine", a fit! of the composite model will necessarily call each node in the computational graph, to trigger the different fits on the appropriate data. Some nodes are not bound to a machine (static nodes), so they cannot be cached, right? Does that mean this kind of node might be evaluated (computed) multiple times if called multiple times?

Perhaps there is some confusion about what "caching" means here. The only caching that takes place is for the benefit of training machines. A machine constructed with cache=true internally caches the data used to train it. Then, if a hyper-parameter changes, and there is no reason to believe the training nodes have changed the data they deliver when called, the cached data is re-used in the next call to fit! the machine. The cached data is generally a model-specific representation of the data (e.g., a matrix instead of a table). Caching was introduced to avoid repeating this internal data pre-processing (and to allow observation resampling to happen at the level of the model-specific representation).

If you have a static node that performs an expensive computation, then caching only helps if the output of the node is needed as training data for a machine downstream; if you are just calling a node downstream of that static node, the static node will need to re-compute. Similarly, if predict or transform is an expensive operation for some internal machine, then caching data only helps machines training downstream of those predict/transform nodes; calling those nodes is still going to be expensive every time. The sketch below illustrates the first point.
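
Here is a toy sketch, with expensive and ncalls invented for the demonstration; a static node, having no machine to cache anything, re-computes on every call:

using MLJ

ncalls = Ref(0)
expensive(X) = (ncalls[] += 1; X)   # stand-in for a costly static transformation

Xs = source((x = rand(10),))
N = @node expensive(Xs)

N(); N()
@assert ncalls[] == 2   # re-evaluated on each call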

Does that help?

@olivierlabayle (Collaborator) commented:

Alright, thank you for the clarification, that helps a lot!

@ablaom (Member, Author) commented Oct 6, 2021

Note to self:

@ablaom (Member, Author) commented Oct 27, 2021

Comment to trigger notification to self.

@ablaom marked this pull request as ready for review December 20, 2021 03:06
@ablaom changed the base branch from dev to for-0-point-19-release December 20, 2021 03:33
@ablaom changed the title from "New interface points for accessing the internal state of an exported learning network composite model" to "Add interface points for accessing the internal state of an exported learning network composite model" Dec 20, 2021
@ablaom mentioned this pull request Dec 20, 2021