Engine + core/cli re-design #359

NickleDave · 2021-05-15T19:12:36Z

edit: this issue was originally about considering backends but I am hijacking it to collect my thoughts on how to (re)design everything

it would really be nice to not be in the business of maintaining a deep learning library, esp since whole teams of ppl are doing that already

would prefer scope for vak to be:

- reference implementations of algos specific to vocal learning community
- associated tools for benchmarking (e.g., WindowDataset)

question is what backends could be used that would handle training / eval / etc.
Starting this issue to keep track of thoughts

current list in my head:

https://github.com/explosion/thinc
- pros:
  - framework agnostic, can use torch or TF <-- for me a big pro
  - already used for spacy
  - plays well with FastAPI, pydantic, and friends
  - TOML-like configuration file format (I think, need to check back)
- (possible) cons:
  - don't have a feeling for how easily abstractions will work outside of NLP. Need to experiment with this in a branch

https://github.com/speechbrain/speechbrain
- pros:
  - appear to be related abstractions we could use, e.g. loss functions
- cons:
  - pytorch only
  - YAML config format, gross


NickleDave · 2021-12-01T13:54:04Z

Still thinking about this.

rn feeling like the best compromise is still to write a very lightweight keras-like API for pytorch

mainly because we need the dataset tooling from torch, and we don't need all the massive lumbering technical debt of tf
https://www.youtube.com/watch?v=XHyASP49ses

the best way forward I think will be to write examples of what the interface should look like, then do the refactor around that. From Design Patterns (as quoted in Fluent Python):

“Program to an interface, not an implementation” and “Favor object composition over class inheritance.”

Basically the Engine class should be refactored to accept callbacks (as described in #405).
Logic for looping over multiple models, if kept, should be moved "up" from the core methods into the cli methods
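
To make the callback idea concrete, here is a minimal sketch of what such an interface could look like. Everything in it is an assumption for illustration (the `Callback` hook names, `LossHistory`, the `step_fn` argument, and the shape of `Engine.train`); it is not vak's actual API and it uses plain floats instead of torch tensors so it runs on its own.

```python
from typing import Callable, Iterable, List, Optional


class Callback:
    """Hypothetical base class: every hook is a no-op by default."""

    def on_epoch_start(self, epoch: int) -> None:
        pass

    def on_step_end(self, epoch: int, step: int, loss: float) -> None:
        pass

    def on_epoch_end(self, epoch: int, mean_loss: float) -> None:
        pass


class LossHistory(Callback):
    """Example callback that records the mean loss of every epoch."""

    def __init__(self) -> None:
        self.history: List[float] = []

    def on_epoch_end(self, epoch: int, mean_loss: float) -> None:
        self.history.append(mean_loss)


class Engine:
    """Hypothetical engine: owns the loop, delegates side effects to callbacks."""

    def __init__(
        self,
        step_fn: Callable[[float], float],
        callbacks: Optional[Iterable[Callback]] = None,
    ) -> None:
        self.step_fn = step_fn  # stand-in for "compute the loss on one batch"
        self.callbacks = list(callbacks or [])

    def train(self, batches: Iterable[float], n_epochs: int) -> None:
        batches = list(batches)
        for epoch in range(n_epochs):
            for cb in self.callbacks:
                cb.on_epoch_start(epoch)
            losses = []
            for step, batch in enumerate(batches):
                loss = self.step_fn(batch)
                losses.append(loss)
                for cb in self.callbacks:
                    cb.on_step_end(epoch, step, loss)
            # max(..., 1) avoids dividing by zero when there are no batches
            mean_loss = sum(losses) / max(len(losses), 1)
            for cb in self.callbacks:
                cb.on_epoch_end(epoch, mean_loss)


recorder = LossHistory()
engine = Engine(step_fn=lambda batch: batch * 2.0, callbacks=[recorder])
engine.train(batches=[1.0, 2.0, 3.0], n_epochs=2)
print(recorder.history)
```

The point of the sketch is that logging, checkpointing, early stopping, and so on become objects passed in by the caller, so the loop itself needs no conditionals for them.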

Would be nice to provide a pytorch_lightning-like Model class too, as described in #406.
I like the idea of declaring "this is a model" in code. I do not like the verbosity of pytorch_lightning

other refactoring notes:
The main thing I know right now is there are way too many conditionals within both core and cli methods.
A hand-wired conditional is just a hidden interface crying out: https://www.youtube.com/watch?v=OMPfEXIlTVE

The key idea for any cli function should be "what is the most common / generic workflow for a user; capture that in code".
So if it doesn't look very much like something a user would want to write, re-factoring / abstraction needs to happen

NickleDave · 2022-07-01T02:49:32Z

So package structure will be something like this

vak.models  # <-- vak.models.sed.tweetynet, vak.models.gen.ava, or whatever schema makes sense
vak.datasets  # <-- bfrepo, etc.
vak.transforms  # <-- clip, denoise, etc.
vak.engine  # <-- vak.engine.Engine.train, vak.engine.Engine.eval, etc.

as in #207 and #405

So the models module/sub-package will be equivalent to torchvision.models with models for task 1, task 2, etc., and then engine will live in its own separate module

NickleDave · 2022-07-08T01:42:17Z

See also #536 -- Model should be an attrs class with: {network, loss, optimizer, metrics}, having defaults for each.
This can be used in each module in vak.models, i.e. there will be a TweetyNetModel in vak/models/sed/tweetynet.py

This is preferable to making a new Model class that users have to subclass as proposed in #406 -- the idea is actually similar (in that I called it an "interface")
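
A rough sketch of the "class with {network, loss, optimizer, metrics} and defaults" idea follows. Assumptions, stated loudly: it uses the standard library's `dataclasses` as a stand-in for attrs so that it runs without extra dependencies, and the components (`identity_network`, `mean_squared_error`, `accuracy`, `SGDConfig`) are hypothetical placeholders, not the real TweetyNet network or torch objects.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


def identity_network(x: Sequence[float]) -> List[float]:
    """Placeholder network: returns its input unchanged."""
    return list(x)


def mean_squared_error(y_pred: Sequence[float], y_true: Sequence[float]) -> float:
    """Placeholder loss."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


def accuracy(y_pred: Sequence[int], y_true: Sequence[int]) -> float:
    """Placeholder metric: fraction of matching elements."""
    return sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)


@dataclass
class SGDConfig:
    """Placeholder for an optimizer; only holds a learning rate."""
    lr: float = 0.001


@dataclass
class TweetyNetModel:
    """Hypothetical model declaration: four attributes, each with a default."""
    network: Callable = identity_network
    loss: Callable = mean_squared_error
    # default_factory gives each instance its own optimizer and metrics dict
    optimizer: SGDConfig = field(default_factory=SGDConfig)
    metrics: Dict[str, Callable] = field(
        default_factory=lambda: {"acc": accuracy}
    )


# Instantiating with no arguments works because every attribute has a default
model = TweetyNetModel()
# Any single piece can still be overridden without subclassing
custom = TweetyNetModel(optimizer=SGDConfig(lr=0.01))
```

With a declaration like this, swapping one component is a keyword argument rather than a new subclass, which is the composition-over-inheritance preference quoted earlier in the thread.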

NickleDave · 2022-07-08T02:15:16Z

Note that whatever we do should fix #362 -- this is the core problem to address here

By making it an attrs model with defaults we can easily instantiate the model by just saying ModelDataClass()

we can have a vak.models.register decorator similar to the one in crowsetta for formats
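
For illustration, a registration decorator could be as small as the sketch below. The names `MODEL_REGISTRY`, `register`, and `from_name` are hypothetical; this shows the general decorator-plus-dictionary pattern and is not crowsetta's or vak's actual implementation.

```python
from typing import Dict, Type

# Hypothetical module-level registry mapping model names to classes
MODEL_REGISTRY: Dict[str, Type] = {}


def register(cls: Type) -> Type:
    """Add a class to the registry under its own name and return it unchanged."""
    name = cls.__name__
    if name in MODEL_REGISTRY:
        raise ValueError(f"a model named {name!r} is already registered")
    MODEL_REGISTRY[name] = cls
    return cls


@register
class TweetyNetModel:
    """Placeholder model class; a real one would carry network, loss, etc."""


def from_name(name: str):
    """Look up a registered model by name and instantiate it with defaults."""
    if name not in MODEL_REGISTRY:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"unknown model {name!r}; registered models: {known}")
    return MODEL_REGISTRY[name]()


instance = from_name("TweetyNetModel")
```

A lookup like `from_name` is what would let a config file refer to a model by a string, with the no-argument instantiation relying on the defaults discussed above.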

NickleDave · 2022-11-23T19:59:56Z

Picking this up again.
The order of operations needs to be:

first, implement the new Model class/decorator as in #536 (ENH: change how Model is declared to remove sub-classing of vak.engine.Model) to fix #362 (make it easier to instantiate a model), because the new engine will expect a Model of this type as one of its arguments
- I am working out how to implement this, will write thoughts down in related issues. I think I will need two things:
  - a decorator, as in #536, that a user will apply to their own class, which will just have class attributes similar to a dataclass
  - and then a separate Model class as in #406 (add a ModelDefinition class) with methods forward and __call__, that addresses #362 by making it possible to forward a tensor through the Model's net and get a tensor out, and also to get an optional output with some sort of post-processing applied by just __call__ing the Model itself

Then, we can implement the new engine with callbacks including default_train, default_predict, etc., as in #405 (ENH/CLN: replace sub-classing Engine/Model methods with callbacks / classes)
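
The two pieces described above (a decorator applied to a class of attributes, plus a separate Model class with forward and __call__) might fit together roughly as follows. This is only a sketch under assumptions: the decorator name `model`, the `post_process` attribute, and the toy functions are all hypothetical, and lists of floats stand in for tensors.

```python
from typing import Callable, List, Optional, Sequence, Type


class Model:
    """Hypothetical wrapper built from a user's definition class."""

    def __init__(
        self, network: Callable, post_process: Optional[Callable] = None
    ) -> None:
        self.network = network
        self.post_process = post_process

    def forward(self, x):
        """Pass the input through the network and return the raw output."""
        return self.network(x)

    def __call__(self, x):
        """Like forward, then apply the optional post-processing step."""
        out = self.forward(x)
        if self.post_process is not None:
            out = self.post_process(out)
        return out


def model(definition: Type) -> Callable[[], Model]:
    """Hypothetical decorator: read class attributes, return a Model factory."""

    def factory() -> Model:
        return Model(
            network=definition.network,
            post_process=getattr(definition, "post_process", None),
        )

    factory.__name__ = definition.__name__
    return factory


def double(x: Sequence[float]) -> List[float]:
    """Toy network: doubles every element."""
    return [2.0 * v for v in x]


def threshold(x: Sequence[float]) -> List[int]:
    """Toy post-processing: 1 where the value exceeds 1.0, else 0."""
    return [int(v > 1.0) for v in x]


@model
class ToyModel:
    # The user only declares attributes; no subclassing is involved
    network = double
    post_process = threshold


toy = ToyModel()
raw = toy.forward([0.25, 1.0])   # raw network output
labels = toy([0.25, 1.0])        # network output after post-processing
```

Here `forward` gives the "tensor in, tensor out" path, while calling the model directly adds the optional post-processing, matching the split described in the list above.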

NickleDave · 2023-01-22T02:58:21Z

Closed by #605

NickleDave added the ENH: enhancement enhancement; new feature or request label May 15, 2021

NickleDave changed the title ~~consider options for a "backend"~~ version 0.5 refactor Dec 1, 2021

NickleDave added the DEV: development development, not source code: e.g. change dependencies, bump version label Dec 1, 2021

NickleDave pinned this issue Dec 1, 2021

NickleDave changed the title ~~version 0.5 refactor~~ version 0.5 design Dec 1, 2021

NickleDave added this to To Do in DEV (roadmap) Dec 1, 2021

NickleDave mentioned this issue Jan 13, 2022

add windows specific install instructions, may need Microsoft C++ installed #262

Closed

NickleDave changed the title ~~version 0.5 design~~ Engine + core/cli re-design Mar 20, 2022

NickleDave self-assigned this Mar 20, 2022

NickleDave moved this from To Do to In progress in DEV (roadmap) Aug 23, 2022

NickleDave mentioned this issue Dec 2, 2022

ENH: Switch to lightning as backend, remove engine package #597

Closed

3 tasks

NickleDave mentioned this issue Dec 25, 2022

Add abstractions to make it easier to declare and instantiate models #605

Merged

NickleDave closed this as completed Jan 22, 2023

NickleDave unpinned this issue Jan 22, 2023

NickleDave mentioned this issue Jan 22, 2023

DOC: Finish documenting new model + family abstractions #616

Open

2 tasks

NickleDave moved this from In progress to Done in DEV (roadmap) Jan 22, 2023

This was referenced Feb 12, 2023

ENH: Add decorator to register models, in vak.models #623

Closed

CLN: Remove unused 'multiple model' functionality #538

Closed
