Description
Hi!
I have been thinking for some time about the current architecture of Ignite and how to improve the workflow, both during application development (writing training scripts) and during feature development. When writing training scripts, I could not code the individual training I wanted, and when trying to code the feature for my training script, I ended up writing infrastructure/architecture code instead of implementing the feature. There were some kinds of restrictions that I couldn't identify at first...
Nevertheless, after a while I found 2 twists, actually nothing big... so here I want to present an under-the-hood framework to integrate into Ignite that solved all my problems and many open issues! Nice, right?
With this framework integrated into Ignite, you get an extremely nice overview during debugging, turn Ignite into a rapid feature dev tool, can handle far more complex (individual) use cases while achieving a higher degree of automation at the same time, and gain quite a few new features and many more possibilities for syntactic sugar.
But now comes the... BUT: as I tried to fix it, I unfortunately had to realize that it won't work without major revisions. So I went a long way to provide proof and facts, something you can play with to form your own opinion, before daring to suggest a major revision...
So, if you're interested in an up-and-running "what-if-when" Ignite version with the 2 twists below untwisted, please have a look at the repository and the documentation and leave me some feedback. I'd really like to know your opinion.
In case enough of you like it and could imagine integrating the framework into Ignite, I could open a pull request for the code on an experimental branch and we see how it goes from there. (Note: I pushed it to a separate repository because, as far as I know, you cannot open a pull request for a new branch, which this definitely needs.)
Everything else you need to know can be found in the Ignite Framework repo and the documentation. For bugs and questions, let me know, thanks!
So, set up your first coffee & enjoy playing!
Teaser from the documentation
Two issues
I am a fan of Ignite and that's why I'm trying to contribute, but I discovered 2 shortcomings in the architecture and the implementation that caused me quite a few restrictions and made me code infrastructure instead of programming new features (which is what I actually wanted to do). The issues are:
- Engine-centered architecture: In current Ignite, the `Engine` is the architectural center with the training `state` as an attribute. The training `state` attribute is a transient object that is only instantiated while the `Engine` is in `run` mode and vanishes afterwards. Also, the `state` holds only a selective fraction of all variables and parameters that make up the real training state. So the `Engine` is a kind of static object and the `state` is transient. This does not represent the reality of the training process. In reality, the training starts with an initial state holding all variables and parameters, including e.g. model variables, hyperparameters etc., which are then modified while the state goes through different transitions. The main transitions of the `state` are `Engine`s (normally more than one). So the state should be the architectural center holding ALL variables, parameters, values, transitions etc., and the Engine is (just) the main transition of the state. This small twist causes quite a few complications for features and APIs, which are listed below.
- Event is broken into many pieces: Currently an `Event` is an `Enum` that has to be fired explicitly with `fire_event` and implicitly with `_fire_event`, so that the `event_handlers` handle the further callbacks. The `Event` is always fired after some other training value has changed, e.g. after the model output has been updated, `ITERATION_COMPLETED` is fired. Also, if you want to fire a non-standard event, you first have to create it, register it with each `Engine` that is supposed to use it, and then the firing has to be implemented... But in reality an Event is nothing more than a value change of a training state variable that triggers callbacks. So all the pieces above can be put together by implementing a state variable as a descriptor.
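To make the descriptor idea a bit more tangible, here is a minimal sketch in plain Python. It is not the framework's actual code: the names `StateVariable`, `State`, `on_change`, `model_output` and `n_iterations` are all made up for illustration. It only shows how assigning to a state variable can itself play the role of "firing an event", so that no separate `Enum`, registration or `fire_event` call is needed.

```python
class StateVariable:
    """Hypothetical descriptor: setting the value runs the attached callbacks."""

    def __set_name__(self, owner, name):
        self.name = name            # public name, e.g. "model_output"
        self.private = "_" + name   # where the value is stored on the instance

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.private, None)

    def __set__(self, instance, value):
        setattr(instance, self.private, value)
        # The value change *is* the event: call everything attached to it.
        for callback in instance._callbacks.get(self.name, []):
            callback(instance)


class State:
    """Hypothetical training state holding variables and their callbacks."""

    model_output = StateVariable()
    n_iterations = StateVariable()

    def __init__(self):
        self._callbacks = {}

    def on_change(self, name, callback):
        self._callbacks.setdefault(name, []).append(callback)


def count_iteration(state):
    # Updating another state variable triggers that variable's callbacks too.
    state.n_iterations = (state.n_iterations or 0) + 1


state = State()
log = []
state.on_change("model_output", lambda s: log.append(("output", s.model_output)))
state.on_change("model_output", count_iteration)
state.on_change("n_iterations", lambda s: log.append(("iteration", s.n_iterations)))

for batch in [1.0, 2.0]:
    state.model_output = batch * 10  # stands in for a real forward pass
```

In this sketch the counterpart of `ITERATION_COMPLETED` is simply "`n_iterations` changed", and a non-standard event would be just another `StateVariable` on the state.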
Improvements from an underlying framework
You will experience the improvements given by the framework when working on all 3 levels: application implementation, feature development and framework development. The separation of these working areas is already the first improvement. Try out the benefits in detail and hands-on for the first two levels in the Quickunderstanding Application and Quickunderstanding Feature Dev.
Before you go through the theoretically described enhancements, these few no-comment teasers of the training `state` in the debugger will give you a nice insight into what's ahead. They show the Ignite example mnist_with_tensorboard.py transferred to the framework architecture, just before the engines are started:
And here is the engine `state.engines.trainer` unfolded:
Or setting up all the Tensorboard charts below with these two simple commands:
# Automatically identify and generate metric charts comparing the different engines
EnginesMetricsComparisonCharts(x_axis_ref=state.trainer.n_samples_ref, n_identical_metric_name_suffixes=1)
# Automatically generate for each engine a summary of all metric charts
EnginesMetricsCharts(x_axes_refs=state.trainer.n_samples_ref, n_identical_metric_name_suffixes=1)
By the way, if you had set up 10x more metrics and some more engines, these two commands would not change and would still provide all comparative and single metric charts for all engines.
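Why the number of commands can stay constant is easiest to see with a small, purely illustrative sketch. It assumes that all metrics are discoverable from one central place (here a plain dict standing in for the training state); `metrics_by_engine`, `comparison_charts` and `per_engine_charts` are hypothetical names and this is not how `EnginesMetricsComparisonCharts` or `EnginesMetricsCharts` are actually implemented.

```python
from collections import defaultdict

# Hypothetical stand-in for the metrics that can be discovered on the state.
metrics_by_engine = {
    "trainer": ["accuracy", "loss"],
    "evaluator": ["accuracy", "loss"],
}


def comparison_charts(metrics_by_engine):
    """One chart per metric name that appears in more than one engine."""
    charts = defaultdict(list)
    for engine, names in metrics_by_engine.items():
        for name in names:
            charts[name].append(f"{engine}/{name}")
    return {name: series for name, series in charts.items() if len(series) > 1}


def per_engine_charts(metrics_by_engine):
    """One summary per engine listing all of that engine's metric series."""
    return {
        engine: [f"{engine}/{name}" for name in names]
        for engine, names in metrics_by_engine.items()
    }


charts = comparison_charts(metrics_by_engine)
summaries = per_engine_charts(metrics_by_engine)

# Adding an engine or a metric changes only the data, not the two calls.
metrics_by_engine["tester"] = ["accuracy", "loss", "precision"]
more_charts = comparison_charts(metrics_by_engine)
```

Because both functions iterate over whatever is registered, the calling code stays the same no matter how many engines or metrics exist.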
Soooo, if you're interested, then grab a coffee and press >>>PLAY<<<!