
What's wrong with the way we export learning networks as new model types #831

Closed · ablaom opened this issue Aug 29, 2022 · 1 comment
ablaom commented Aug 29, 2022

The learning networks API appeared in the very earliest versions of MLJ and, in response to a great many feature requests, has grown over time into a bit of a hack. Here are some complaints:

  • From the user's point of view, the process for exporting a learning network is too complicated (e.g., one must first define a learning network machine, and so on; see the first sketch after this list)
  • It is also too mysterious (too much hidden knowledge)
  • From the developer's point of view, it is likewise too complicated and too mysterious, making it a challenge to maintain and enhance, even for those very familiar with it.
  • There are unnatural restrictions on what a user can do, such as the one that led to Prohibit distinct fields in composite models pointing to two models that are === #377 (see the second sketch after this list). This becomes worse if we move to allowing immutable model structs (for, e.g., integration with TableTransforms.jl)
  • Retraining a composite is not "smart" if a component model is replaced, only if it is mutated in place (also illustrated in the second sketch below)
  • The fact that fitted_params(::Machine) and report(::Machine) need special-casing for machines bound to composite models feels like an unnecessary complication and is making it difficult to design uniform interfaces; see, e.g., Storing intermediate results of a Composite Model MLJ.jl#841 (comment).
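For concreteness, here is roughly what the export workflow being criticized looks like, as I recall the current API (circa MLJBase 0.20); this is a sketch only, and `CompositeA` and its single field are made up for illustration:

```julia
using MLJ
import MLJBase

# a made-up composite with one tunable component:
mutable struct CompositeA <: DeterministicComposite
    regressor
end

function MLJBase.fit(composite::CompositeA, verbosity, X, y)
    Xs = source(X)
    ys = source(y)

    # build the learning network:
    stand = machine(Standardizer(), Xs)
    W = transform(stand, Xs)
    reg = machine(composite.regressor, W, ys)  # points to a hyper-parameter
    yhat = predict(reg, W)

    # the extra "learning network machine" step users find mysterious:
    network_mach = machine(Deterministic(), Xs, ys; predict=yhat)
    return!(network_mach, composite, verbosity)
end
```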
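And a second sketch, illustrating the restriction of #377 and the retraining complaint. Here `CompositeB` is hypothetical: assume it is an already-exported composite with two model fields and a keyword constructor:

```julia
using MLJ
KNNRegressor = @load KNNRegressor pkg=NearestNeighborModels

knn = KNNRegressor()

# currently prohibited, because the two fields would hold the same (===)
# instance (see #377):
# composite = CompositeB(one=knn, two=knn)

composite = CompositeB(one=knn, two=KNNRegressor())

X, y = make_regression(100, 3)
mach = machine(composite, X, y)
fit!(mach)

composite.one.K = 2        # mutating the component in place:
fit!(mach)                 # retraining is "smart" (only affected machines refit)

composite.one = KNNRegressor(K=2)  # replacing the component:
fit!(mach)                         # smartness lost; retrains from scratch
```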

These issues do not just affect esoteric applications of model composition; they are holding back some important developments, e.g., in outlier detection and in TableTransforms.jl integration.

The root causes for these issues are:

1. The way we currently establish a mapping from the composite model hyper-parameters that specify component models to the corresponding machines in the network, i.e., the machines that are meant to point to those component models.
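MLJ's actual mechanism differs in detail, but a toy version of identity-based association (which is, as I understand it, what the current design effectively relies on) shows the fragility:

```julia
# Toy illustration only (not MLJ's actual code): associate "machines" to
# model instances by object identity.
association = IdDict{Any,String}()

m1 = Ref(1.0)                       # stand-in for a mutable model instance
association[m1] = "machine for m1"

m1[] = 2.0                          # mutation: identity survives
@assert haskey(association, m1)     # machine still found -> "smart" retrain

m2 = Ref(2.0)                       # replacement: a fresh identity
@assert !haskey(association, m2)    # association lost -> retrain from scratch

# And if two fields hold the very same instance, the association is
# ambiguous, which is why #377 prohibits that arrangement.
```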

2. The fact that we prematurely merge reports from fit with reports from an operation, instead of keeping them separate in the machine and providing a model-specific method to say how to combine them.
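For 2, the kind of separation I have in mind looks something like the following. This is entirely hypothetical: `combined_report`, `fit_report` and `operation_reports` do not exist in MLJBase today, and the real hook may look different:

```julia
import MLJBase

struct SomeModel <: MLJBase.Deterministic end   # placeholder model type

# Hypothetical model-specific combiner: given the report returned by fit and
# a named tuple of per-operation reports kept separately on the machine,
# say how to merge them into what report(mach) ultimately returns:
combined_report(::SomeModel, fit_report, operation_reports) =
    merge(fit_report, (predict=operation_reports.predict,))
```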

I have a pretty good idea of how to resolve 1, but it will take a little time to put together. Fixing 2 after 1 will be easier, and what is needed there is more or less obvious. Stay tuned for PRs to address these. Thank you for your patience.
