Merge pull request #468 from mv1388/experiment-documentation
Experiment documentation
mv1388 committed Apr 12, 2020
2 parents 69135d1 + e2b32e5 commit b9246a5
Showing 8 changed files with 200 additions and 3 deletions.
@@ -46,7 +46,7 @@ def prepare_results_dict(self):
Mostly this consists of executing calculation of selected performance metrics and returning their result dicts.
If you want to use multiple performance metrics you have to combine them in the single self.results_dict
at the end by doing this:
- self.results_dict = {**metric_dict_1, **metric_dict_2}
+ return {**metric_dict_1, **metric_dict_2}
Returns:
dict: calculated result dict
22 changes: 22 additions & 0 deletions docs/source/experiment.rst
@@ -1,6 +1,28 @@
experiment
==========

:mod:`aitoolbox.experiment` defines the experiment tracking and performance evaluation components. Because all
implemented components are completely independent of the TrainLoop engine, they can be used either on their own
in a more manual mode or as part of the TrainLoop functionality available in :mod:`aitoolbox.torchtrain`. Due to this
independence, certain elements for performance evaluation can even be used to evaluate non-PyTorch models.

In general, :mod:`aitoolbox.experiment` helps the user with the following:

* Structured and reusable performance evaluation logic definition

  * :mod:`aitoolbox.experiment.result_package`
  * :mod:`aitoolbox.experiment.core_metrics`

* Tracked training performance history primitive

  * :mod:`aitoolbox.experiment.training_history`

* High-level experiment tracking API

  * :mod:`aitoolbox.experiment.experiment_saver`
  * :mod:`aitoolbox.experiment.local_experiment_saver`

* Low-level experiment tracking primitives for model saving and performance results saving

  * :mod:`aitoolbox.experiment.local_save`

* Low-level primitives for re-loading saved models

  * :mod:`aitoolbox.experiment.local_load.local_model_load`


.. toctree::
:maxdepth: 1
:caption: Guides:
165 changes: 165 additions & 0 deletions docs/source/experiment/result_package.rst
@@ -1,4 +1,169 @@
Result Package
==============

A result package, found in :mod:`aitoolbox.experiment.result_package`, defines the set of evaluation metrics
used to evaluate a model's performance on a certain ML task. For example, in a simple classification task,
the corresponding result package would include metrics such as *accuracy*, *F1 score*, *ROC-AUC* and *PR-AUC*.
Result packages can thus be thought of as wrappers around the set of evaluation metrics commonly used for a given
ML task.

As with all other components of the :mod:`aitoolbox.experiment` module, result packages can be used in a standalone,
manually executed fashion for any kind of ML experiment evaluation. Alternatively, they can be used in unison with
the TrainLoop model training engine from :mod:`aitoolbox.torchtrain`. There, the result package assumes the role of
the *evaluation recipe* for a certain ML task: by providing the result package to the TrainLoop, the user tells it
how to automatically evaluate the model's performance during or at the end of the training process.


Using Result Packages
---------------------

Result package implementations can be found in :mod:`aitoolbox.experiment.result_package`. AIToolbox already comes
with result packages for various popular ML tasks out of the box. These can be found in
:mod:`aitoolbox.experiment.result_package.basic_packages`.


Result Package with torchtrain TrainLoop
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When using result packages as part of TrainLoop-supported training, there are two main use cases. The first is as
part of the :class:`aitoolbox.torchtrain.callbacks.performance_eval.ModelPerformanceEvaluation` callback, which
optionally evaluates the model's performance during training, e.g. after each epoch. The second is as part of
the *"EndSave"* TrainLoops, which automatically evaluate the model's performance based on the provided result
packages at the end of the training.

.. code-block:: python

    import torch.nn as nn
    import torch.optim as optim
    from torch.utils.data import DataLoader

    from aitoolbox.torchtrain.train_loop import *
    from aitoolbox.experiment.result_package.basic_packages import ClassificationResultPackage
    from aitoolbox.torchtrain.callbacks.performance_eval import \
        ModelPerformanceEvaluation, ModelPerformancePrintReport


    hyperparams = {
        'lr': 0.001,
        'betas': (0.9, 0.999)
    }

    model = CNNModel()  # TTModel based neural model
    train_loader = DataLoader(...)
    val_loader = DataLoader(...)
    test_loader = DataLoader(...)

    optimizer = optim.Adam(model.parameters(), lr=hyperparams['lr'], betas=hyperparams['betas'])
    criterion = nn.NLLLoss()

    callbacks = [ModelPerformanceEvaluation(ClassificationResultPackage(), hyperparams,
                                            on_train_data=True, on_val_data=True),
                 ModelPerformancePrintReport(['train_Accuracy', 'val_Accuracy'])]

    tl = TrainLoopCheckpointEndSave(
        model,
        train_loader, val_loader, test_loader,
        optimizer, criterion,
        project_name='train_loop_examples',
        experiment_name='result_package_with_trainloop_example',
        local_model_result_folder_path='results_dir',
        hyperparams=hyperparams,
        val_result_package=ClassificationResultPackage(),
        test_result_package=ClassificationResultPackage()
    )

    # pass the callbacks defined above into the training run
    model = tl.fit(num_epochs=10, callbacks=callbacks)
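Note: the metric names passed to ``ModelPerformancePrintReport`` above (``train_Accuracy``, ``val_Accuracy``) assume
that :class:`ClassificationResultPackage` reports a metric named *Accuracy*, prefixed with the data split on which
the callback evaluated it.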
Standalone Result Package Use
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As mentioned above, result packages are completely independent of the TrainLoop engine and can thus also be used
for standalone model performance evaluation, even when not dealing with PyTorch models:

.. code-block:: python

    from aitoolbox.experiment.result_package.basic_packages import BinaryClassificationResultPackage

    y_true = ...  # ground truth labels
    y_predicted = ...  # predicted by the model

    result_pkg = BinaryClassificationResultPackage()
    result_pkg.prepare_result_package(y_true, y_predicted)

    # get the results dict with performance results of all the metrics in the result package
    performance_results = result_pkg.get_results()
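To make the non-PyTorch use concrete, below is a minimal runnable sketch evaluating a hypothetical scikit-learn
classifier. The dataset and model are illustrative assumptions; only ``prepare_result_package()`` and
``get_results()`` come from the result package API shown above.

.. code-block:: python

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    from aitoolbox.experiment.result_package.basic_packages import BinaryClassificationResultPackage

    # hypothetical non-PyTorch model: a scikit-learn logistic regression
    X, y = make_classification(n_samples=200, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
    model = LogisticRegression().fit(X_train, y_train)

    # evaluate the predictions with the result package, no TrainLoop involved
    result_pkg = BinaryClassificationResultPackage()
    result_pkg.prepare_result_package(y_test, model.predict(X_test))
    performance_results = result_pkg.get_results()
    print(performance_results)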
Implementing New Result Packages
--------------------------------

Although AIToolbox already provides result packages for many ML tasks, sometimes the user wants to define novel or
as-yet-unsupported performance evaluation metrics to properly evaluate the ML task at hand. Creating new result
packages in AIToolbox is supported and can be done very easily.

A new result package is implemented as a class inheriting from the abstract base result package
:class:`aitoolbox.experiment.result_package.abstract_result_packages.AbstractResultPackage` and implementing
the abstract method :meth:`aitoolbox.experiment.result_package.abstract_result_packages.AbstractResultPackage.prepare_results_dict`.

Inside ``prepare_results_dict()`` the user implements the logic evaluating the performance metrics forming the result
package. The evaluation normally requires the predicted and ground truth values. These are inserted into the package
at run time (via ``prepare_result_package()``) and exposed inside the result package via the ``self.y_true`` and
``self.y_predicted`` attributes. The user-defined logic inside ``prepare_results_dict()`` should access the values
in *y_true* and *y_predicted*, pass them through the desired performance metric computations and return the results
in dict form. Inside the returned dict, the keys should be the evaluated metric names and the values the
corresponding evaluated performance metric values.

The performance metric computation can be implemented directly inside the result package class in the
``prepare_results_dict()`` method. However, especially for more complex performance metric logic, it is common
practice in AIToolbox to implement each performance metric as a separate specialized metric class. This ensures
better reusability of the implemented metrics, as well as more readable and structured result package code. The
result packages then become lightweight wrappers around the selected performance metrics, while the actual metric
logic and calculation is done inside the metric objects instead of the encapsulating result package. To learn more
about AIToolbox performance metric use and implementation, have a look at the :doc:`metrics` documentation section.
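As an orientation, here is a sketch of what such a specialized metric class might look like. It assumes the
``AbstractBaseMetric`` base class and constructor signature described in the :doc:`metrics` section; check there
for the exact interface.

.. code-block:: python

    from sklearn.metrics import accuracy_score

    from aitoolbox.experiment.core_metrics.abstract_metric import AbstractBaseMetric


    class ExampleAccuracyMetric(AbstractBaseMetric):
        def __init__(self, y_true, y_predicted):
            # metric_name is the key under which the result is reported
            AbstractBaseMetric.__init__(self, y_true, y_predicted, metric_name='Accuracy')

        def calculate_metric(self):
            # the value returned here is stored as the metric's result
            return accuracy_score(self.y_true, self.y_predicted)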


Example of a Result Package Using AIToolbox Metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    from aitoolbox.experiment.result_package.abstract_result_packages import AbstractResultPackage
    from aitoolbox.experiment.core_metrics.classification import \
        AccuracyMetric, ROCAUCMetric, PrecisionRecallCurveAUCMetric


    class ExampleClassificationResultPackage(AbstractResultPackage):
        def __init__(self):
            AbstractResultPackage.__init__(self, pkg_name='ExampleClassificationResult')

        def prepare_results_dict(self):
            accuracy_result = AccuracyMetric(self.y_true, self.y_predicted)
            roc_auc_result = ROCAUCMetric(self.y_true, self.y_predicted)
            pr_auc_result = PrecisionRecallCurveAUCMetric(self.y_true, self.y_predicted)

            # summing metric objects combines them into a single results dict
            return accuracy_result + roc_auc_result + pr_auc_result
Example of Result Package with Direct Performance Metric Calculation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    from aitoolbox.experiment.result_package.abstract_result_packages import AbstractResultPackage
    from sklearn.metrics import accuracy_score, roc_auc_score, precision_recall_curve, auc


    class ExampleClassificationResultPackage(AbstractResultPackage):
        def __init__(self):
            AbstractResultPackage.__init__(self, pkg_name='ExampleClassificationResult')

        def prepare_results_dict(self):
            accuracy = accuracy_score(self.y_true, self.y_predicted)
            roc_auc = roc_auc_score(self.y_true, self.y_predicted)

            precision, recall, thresholds = precision_recall_curve(self.y_true, self.y_predicted)
            pr_auc = auc(recall, precision)

            return {'accuracy': accuracy, 'roc_auc': roc_auc, 'pr_auc': pr_auc}
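Either of the two example packages above can then be used like any built-in package: passed to the TrainLoop as
``val_result_package`` / ``test_result_package``, given to the ``ModelPerformanceEvaluation`` callback, or evaluated
standalone via ``prepare_result_package()``.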
2 changes: 2 additions & 0 deletions docs/source/torchtrain/apex_training.rst
@@ -19,6 +19,7 @@ Example of initialization is shown below and more can be read in the
from apex import amp
from aitoolbox.torchtrain.train_loop import *
train_loader = DataLoader(...)
val_loader = DataLoader(...)
test_loader = DataLoader(...)
@@ -57,6 +58,7 @@ approach (DDP is currently the only multi-GPU training setup supported by Apex AMP).
from aitoolbox.torchtrain.train_loop import *
train_loader = DataLoader(...)
val_loader = DataLoader(...)
test_loader = DataLoader(...)
5 changes: 3 additions & 2 deletions docs/source/torchtrain/callbacks.rst
@@ -28,6 +28,7 @@ Example of several basic callbacks used to infuse additional logic into the
from aitoolbox.torchtrain.train_loop import *
from aitoolbox.torchtrain.callbacks.basic import EarlyStopping, TerminateOnNaN, AllPredictionsSame
model = CNNModel() # TTModel based neural model
train_loader = DataLoader(...)
val_loader = DataLoader(...)
@@ -54,8 +55,8 @@ For a full working example which shows the use of multiple callbacks of various
<https://github.com/mv1388/aitoolbox/blob/master/examples/TrainLoop_use/trainloop_fully_tracked_experiment.py#L81>`_.


-Developing New Callbacks
-------------------------
+Implementing New Callbacks
+--------------------------

However, when some completely new functionality is desired that is not available out of the box in AIToolbox,
the user can also implement their own custom callbacks. These can then be used like any other callback to further
1 change: 1 addition & 0 deletions docs/source/torchtrain/model.rst
@@ -26,6 +26,7 @@ the TrainLoop:
from aitoolbox.torchtrain.model import TTModel
class MyNeuralModel(TTModel):
def __init__(self):
# model layers, etc.
2 changes: 2 additions & 0 deletions docs/source/torchtrain/parallel.rst
@@ -20,6 +20,7 @@ core *PyTorch* with *DataParallel*:
from aitoolbox.torchtrain.train_loop import *
from aitoolbox.torchtrain.parallel import TTDataParallel
model = CNNModel() # TTModel based neural model
model = TTDataParallel(model)
@@ -60,6 +61,7 @@ otherwise when not training distributed).
from aitoolbox.torchtrain.train_loop import *
model = CNNModel() # TTModel based neural model
train_loader = DataLoader(...)
4 changes: 4 additions & 0 deletions docs/source/torchtrain/train_loop.rst
@@ -62,6 +62,7 @@ Example of the ``TrainLoop`` used to train the model:
from aitoolbox.torchtrain.train_loop import *
model = CNNModel() # TTModel based neural model
train_loader = DataLoader(...)
val_loader = DataLoader(...)
@@ -89,6 +90,7 @@ The API can be found in: :class:`aitoolbox.torchtrain.train_loop.TrainLoopCheckp
from aitoolbox.torchtrain.train_loop import *
from aitoolbox.experiment.result_package.basic_packages import ClassificationResultPackage
hyperparams = {
'lr': 0.001,
'betas': (0.9, 0.999)
@@ -133,6 +135,7 @@ section.
from aitoolbox.torchtrain.train_loop import *
from aitoolbox.experiment.result_package.basic_packages import ClassificationResultPackage
hyperparams = {
'lr': 0.001,
'betas': (0.9, 0.999)
@@ -183,6 +186,7 @@ For a full working example of the ``TrainLoopCheckpointEndSave`` training, check
from aitoolbox.torchtrain.train_loop import *
from aitoolbox.experiment.result_package.basic_packages import ClassificationResultPackage
hyperparams = {
'lr': 0.001,
'betas': (0.9, 0.999)
