Training State Centered Framework vs. Engine Centered Architecture

Hi!

For some time I have been thinking about the current architecture of Ignite and how to improve the workflow both during application development (writing training scripts) and during feature development. When writing training scripts, I could not code the individual training I wanted, and when trying to code a feature for my training script, I ended up writing infrastructure/architecture code instead of implementing the feature. There were restrictions that I couldn't identify at first...

Nevertheless, after a while I found 2 twists, actually nothing big... so here I want to propose an under-the-hood framework to integrate into Ignite that solves all my problems and many open issues! Nice, right?

With this framework integrated into Ignite, you get an extremely nice overview during debugging, turn Ignite into a rapid feature development tool, can handle far more complex (individual) use cases while achieving a higher degree of automation at the same time, and gain quite a few new features and many more possibilities for syntactic sugar.

But now comes the... BUT: as I tried to fix it, I unfortunately had to realize that it won't work without major revisions. So I went a long way to provide proof and facts - something you can play with to form your own opinion - before daring to suggest a major revision...

So, if you're interested in an up-and-running "what-if-when" Ignite version with the 2 twists below untwisted, please have a look at the [repository](https://github.com/DrStoop/ignite_framework) and the [documentation](https://drstoop.github.io/ignite_framework/index.html#a-framework-for-ignite) and leave me some feedback - I'd really like to know your opinion.

If enough of you like it and could imagine integrating the framework into Ignite, I could open a pull request with the code on an experimental branch and we see how it goes from there. (Note: I pushed it to a separate repository because, as far as I know, you cannot create a new branch via a pull request - which this definitely needs.)

Everything else you need to know can be found in the [Ignite Framework repo](https://github.com/DrStoop/ignite_framework) and the [docs](https://drstoop.github.io/ignite_framework/index.html#a-framework-for-ignite). For bugs & questions, let me know, thx!

So, set up your first coffee & enjoy playing!


### Teaser from the [documentation](https://drstoop.github.io/ignite_framework/index.html#a-framework-for-ignite)

#### Two issues

I am a fan of Ignite and that's why I'm trying to contribute, but I discovered 2 shortcomings in the architecture and the implementation that caused me quite some restrictions and had me coding infrastructure instead of programming new features (which is what I actually wanted to do). The issues are:

* **Engine centered architecture**: In current Ignite the `Engine` is the architectural center, with the training `state` as an attribute. The training `state` attribute is a transient object that is only instantiated while the `Engine` is in `run`-mode and vanishes afterwards. The `state` also holds only a selective fraction of all the variables and parameters that make up the real training state. So `Engine` is a kind of static object and `state` is transient. This does not represent the reality of the training process. In reality, the training starts with an initial state holding all variables and parameters, including e.g. model variables, hyperparameters etc., which are then modified while the state goes through different transitions. The main transitions of the `state` are `Engine`s (normally more than one). **So the state should be the architectural center holding ALL variables, parameters, values, transitions etc. and the Engine is (just) the main transition of the state.** This small twist causes quite some complications for features and APIs, which are listed below.
* **Event is broken in many pieces**: Currently an `Event` is an `Enum` that has to be explicitly `fire_event`ed and implicitly `_fire_event`ed so that the `event_handlers` handle further callbacks. The `Event` is always fired after some other training value has changed, e.g. after the model output has been updated, `ITERATION_COMPLETED` is fired. Also, if you want to fire a non-standard event, you first have to create it, register it at each `Engine` that is supposed to use it, and then implement the firing... But in reality an **Event is nothing more than a value change of a training state variable that triggers callbacks**. So all the pieces above can be put together by implementing a state variable as a descriptor.
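
To make the two points above a bit more concrete, here is a minimal, self-contained sketch of the idea in plain Python. It is not the framework's actual code: the names `StateVariable`, `TrainingState`, `attach` and `trainer` are hypothetical and chosen only for illustration. It assumes nothing beyond the standard descriptor protocol: a state variable is a descriptor, assigning to it is the "event", and a toy engine is just a function that transitions the state.

```python
class StateVariable:
    """Descriptor (hypothetical name): a value change fires the attached callbacks."""

    def __set_name__(self, owner, name):
        self.name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.private_name, None)

    def __set__(self, instance, value):
        setattr(instance, self.private_name, value)
        # The "event" is the value change itself: run every attached callback.
        for callback in instance._callbacks.get(self.name, []):
            callback(value)


class TrainingState:
    """Toy state object (hypothetical) that owns the variables and their callbacks."""

    n_iterations = StateVariable()
    output = StateVariable()

    def __init__(self):
        self._callbacks = {}

    def attach(self, variable_name, callback):
        """Register a callback that runs whenever `variable_name` is assigned."""
        self._callbacks.setdefault(variable_name, []).append(callback)


def trainer(state, losses):
    """A toy 'engine': a transition that only mutates the shared state."""
    for loss in losses:
        state.output = loss  # no explicit event firing needed
        state.n_iterations = (state.n_iterations or 0) + 1


state = TrainingState()
seen = []
state.attach("output", seen.append)
state.attach("output", lambda value: print(f"output updated: {value}"))

trainer(state, [0.9, 0.5])
```

In this sketch there is no separate event enum, registration step or firing call: attaching a callback to `output` is enough, and the state outlives the toy engine run. How the real framework organizes this is described in the linked documentation.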



#### Improvements from an underlying framework


You will experience the improvements given by the framework when working on all 3 levels: application implementation, feature development and framework development. The separation of these working areas is already the first improvement. Try out the benefits in detail & hands-on for the first two levels in the [Quickunderstanding Application](https://drstoop.github.io/ignite_framework/quickunderstanding_app.html) and [Quickunderstanding Feature Dev](https://drstoop.github.io/ignite_framework/quickunderstanding_feature_dev.html).

Before you go through the theoretically described enhancements, these few `no-comment-teasers` of the training ``state`` in the debugger will give you a nice impression of what's ahead. They show the Ignite example [mnist_with_tensorboard.py](https://github.com/pytorch/ignite/blob/master/examples/mnist/mnist_with_tensorboardx.py) [transferred to the framework architecture](https://drstoop.github.io/ignite_framework/examples/mnist_with_tensorboard_logger_and_high_level_apis.html) just before the _engines are started_:

![teaser_state_in_debugger](https://user-images.githubusercontent.com/19177740/75343252-b56c3280-58a0-11ea-8e74-6eee18896a6c.png)

And here the engine ``state.engines.trainer`` unfolded:

![teaser_engine_in_debugger](https://user-images.githubusercontent.com/19177740/75343291-cb79f300-58a0-11ea-9480-0be4a709193c.png)

Or setting up all the Tensorboard charts below with these two simple commands:

```python
# Automatically identify and generate metric charts comparing the different engines
EnginesMetricsComparisonCharts(x_axis_ref=state.trainer.n_samples_ref, n_identical_metric_name_suffixes=1)
# Automatically generate for each engine a summary of all metric charts
EnginesMetricsCharts(x_axes_refs=state.trainer.n_samples_ref, n_identical_metric_name_suffixes=1)
```
By the way, if you had set up 10x more metrics and some more engines, these two commands would stay exactly the same and still provide all comparative and single metric charts for all engines.

![teaser_tensorboard](https://user-images.githubusercontent.com/19177740/75343351-f82e0a80-58a0-11ea-8bdc-057d21a598e2.png)


Soooo, if you're interested, then grab a coffee and press [>>>PLAY<<<](https://drstoop.github.io/ignite_framework/index.html)!
