
updated early stopping documentation

lukostaz committed Apr 18, 2019
1 parent 0f23d6a commit 5a418088c4577f9a01c4e60941b9aa2f89ccf46f
Showing with 146 additions and 7 deletions.
  1. +30 −3 ampligraph/evaluation/protocol.py
  2. +116 −4 ampligraph/latent_features/models.py
@@ -773,8 +773,7 @@ def select_best_model_ranking(model_class, X, param_grid, use_filter=False, earl
The function also retrains the best performing model on the concatenation of training and validation sets.
Note that negatives are generated at runtime according to the strategy described in :cite:`bordes2013translating`.
Parameters
----------
@@ -789,7 +788,35 @@ def select_best_model_ranking(model_class, X, param_grid, use_filter=False, earl
use_filter : bool
If True, will use the entire input dataset X to compute filtered MRR
early_stopping: bool
Flag to enable early stopping (default: ``False``).
If set to ``True``, the training loop adopts the following early stopping heuristic:

- The model will be trained regardless of early stopping for ``burn_in`` epochs.
- Every ``check_interval`` epochs the method will compute the metric specified in ``criteria``.

If the metric decreases for ``stop_interval`` checks, training stops early.

Note that the metric is computed on ``x_valid``, which is usually a held-out validation set.
Also, because ``criteria`` is a ranking metric, it requires generating negatives.
The entities used to generate corruptions can be specified, as well as the side(s) of the triples to corrupt.
The method also supports filtered metrics: pass an array of positives to ``x_filter``, and it will be used
to filter the negatives generated on the fly (i.e. the corruptions).

.. note::
    The early stopping criteria may introduce some overhead, caused by the metric computation.
    The goal is to strike a good trade-off between that overhead and the training epochs saved.
    A common approach is to use unfiltered MRR (a fuller usage sketch follows this excerpt)::

        early_stopping_params={'x_valid': X['valid'], 'criteria': 'mrr'}

    Note that the size of the validation set also contributes to the overhead;
    in most cases a smaller validation set is enough.
early_stopping_params: dict
Dictionary of parameters for early stopping.
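For context, a minimal sketch of how this could be wired up is shown below. The dataset loader,
model class, parameter grid keys, and hyperparameter values are illustrative assumptions and not
part of this commit; only the ``early_stopping``/``early_stopping_params`` arguments mirror the
signature in the hunk above::

    from ampligraph.datasets import load_wn18
    from ampligraph.evaluation import select_best_model_ranking
    from ampligraph.latent_features import ComplEx  # assumed model class for illustration

    # load_wn18() returns a dict with 'train', 'valid', 'test' splits.
    X = load_wn18()

    # Hypothetical parameter grid; valid keys depend on the chosen model class.
    param_grid = {'batches_count': [50], 'epochs': [300], 'k': [100, 200]}

    # Model selection with early stopping monitored on the validation split.
    # Return values are omitted here; see the function's full documentation.
    select_best_model_ranking(ComplEx, X, param_grid,
                              use_filter=True,
                              early_stopping=True,
                              early_stopping_params={'x_valid': X['valid'],
                                                     'criteria': 'mrr'})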
@@ -1315,7 +1315,35 @@ def fit(self, X, early_stopping=False, early_stopping_params={}):
X : ndarray, shape [n, 3]
The training triples
early_stopping: bool
Flag to enable early stopping (default: ``False``).
If set to ``True``, the training loop adopts the following early stopping heuristic:

- The model will be trained regardless of early stopping for ``burn_in`` epochs.
- Every ``check_interval`` epochs the method will compute the metric specified in ``criteria``.

If the metric decreases for ``stop_interval`` checks, training stops early.

Note that the metric is computed on ``x_valid``, which is usually a held-out validation set.
Also, because ``criteria`` is a ranking metric, it requires generating negatives.
The entities used to generate corruptions can be specified, as well as the side(s) of the triples to corrupt.
The method also supports filtered metrics: pass an array of positives to ``x_filter``, and it will be used
to filter the negatives generated on the fly (i.e. the corruptions).

.. note::
    The early stopping criteria may introduce some overhead, caused by the metric computation.
    The goal is to strike a good trade-off between that overhead and the training epochs saved.
    A common approach is to use unfiltered MRR (a fuller usage sketch follows this excerpt)::

        early_stopping_params={'x_valid': X['valid'], 'criteria': 'mrr'}

    Note that the size of the validation set also contributes to the overhead;
    in most cases a smaller validation set is enough.
early_stopping_params: dictionary
Dictionary of hyperparameters for the early stopping heuristics.
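A minimal, self-contained sketch of the basic (unfiltered) usage described above; the model class
and the toy triples are illustrative assumptions, since this excerpt does not name the class it documents::

    import numpy as np
    from ampligraph.latent_features import TransE  # assumed model class for illustration

    # Toy knowledge graph: any ndarray of shape [n, 3] of (subject, predicate, object) triples works.
    X_train = np.array([['a', 'likes', 'b'],
                        ['b', 'likes', 'c'],
                        ['a', 'knows', 'c'],
                        ['c', 'knows', 'b']])
    X_valid = np.array([['b', 'knows', 'a']])

    model = TransE(batches_count=1, epochs=200, k=10)

    # Unfiltered MRR computed on X_valid drives the early stopping heuristic.
    model.fit(X_train,
              early_stopping=True,
              early_stopping_params={'x_valid': X_valid, 'criteria': 'mrr'})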
@@ -1547,7 +1575,35 @@ def fit(self, X, early_stopping=False, early_stopping_params={}):
X : ndarray, shape [n, 3]
The training triples
early_stopping: bool
Flag to enable early stopping (default: ``False``).
If set to ``True``, the training loop adopts the following early stopping heuristic:

- The model will be trained regardless of early stopping for ``burn_in`` epochs.
- Every ``check_interval`` epochs the method will compute the metric specified in ``criteria``.

If the metric decreases for ``stop_interval`` checks, training stops early.

Note that the metric is computed on ``x_valid``, which is usually a held-out validation set.
Also, because ``criteria`` is a ranking metric, it requires generating negatives.
The entities used to generate corruptions can be specified, as well as the side(s) of the triples to corrupt.
The method also supports filtered metrics: pass an array of positives to ``x_filter``, and it will be used
to filter the negatives generated on the fly (i.e. the corruptions).

.. note::
    The early stopping criteria may introduce some overhead, caused by the metric computation.
    The goal is to strike a good trade-off between that overhead and the training epochs saved.
    A common approach is to use unfiltered MRR (a fuller usage sketch follows this excerpt)::

        early_stopping_params={'x_valid': X['valid'], 'criteria': 'mrr'}

    Note that the size of the validation set also contributes to the overhead;
    in most cases a smaller validation set is enough.
early_stopping_params: dictionary
Dictionary of hyperparameters for the early stopping heuristics.
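A sketch of the filtered variant mentioned above, where corruptions are filtered against all known
positives; the dataset loader, model class, and the choice of truncating the validation split are
illustrative assumptions::

    import numpy as np
    from ampligraph.datasets import load_wn18
    from ampligraph.latent_features import DistMult  # assumed model class for illustration

    X = load_wn18()

    # Positives used to filter the corruptions generated on the fly.
    positives_filter = np.concatenate((X['train'], X['valid'], X['test']))

    model = DistMult(batches_count=50, epochs=300, k=100)

    # A smaller validation subset keeps the metric-computation overhead down.
    model.fit(X['train'],
              early_stopping=True,
              early_stopping_params={'x_valid': X['valid'][:500],
                                     'criteria': 'mrr',
                                     'x_filter': positives_filter})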
@@ -1794,7 +1850,35 @@ def fit(self, X, early_stopping=False, early_stopping_params={}):
X : ndarray, shape [n, 3]
The training triples
early_stopping: bool
Flag to enable early stopping (default: ``False``).
If set to ``True``, the training loop adopts the following early stopping heuristic:

- The model will be trained regardless of early stopping for ``burn_in`` epochs.
- Every ``check_interval`` epochs the method will compute the metric specified in ``criteria``.

If the metric decreases for ``stop_interval`` checks, training stops early.

Note that the metric is computed on ``x_valid``, which is usually a held-out validation set.
Also, because ``criteria`` is a ranking metric, it requires generating negatives.
The entities used to generate corruptions can be specified, as well as the side(s) of the triples to corrupt.
The method also supports filtered metrics: pass an array of positives to ``x_filter``, and it will be used
to filter the negatives generated on the fly (i.e. the corruptions).

.. note::
    The early stopping criteria may introduce some overhead, caused by the metric computation.
    The goal is to strike a good trade-off between that overhead and the training epochs saved.
    A common approach is to use unfiltered MRR (a fuller usage sketch follows this excerpt)::

        early_stopping_params={'x_valid': X['valid'], 'criteria': 'mrr'}

    Note that the size of the validation set also contributes to the overhead;
    in most cases a smaller validation set is enough.
early_stopping_params: dictionary
Dictionary of hyperparameters for the early stopping heuristics.
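The trade-off between monitoring overhead and saved epochs is controlled by the scheduling keys
named above (``burn_in``, ``check_interval``, ``stop_interval``). A hedged sketch follows; the model
class, dataset loader, and the specific values are illustrative assumptions::

    from ampligraph.datasets import load_wn18
    from ampligraph.latent_features import ComplEx  # assumed model class for illustration

    X = load_wn18()
    model = ComplEx(batches_count=50, epochs=1000, k=100)

    model.fit(X['train'],
              early_stopping=True,
              early_stopping_params={'x_valid': X['valid'],
                                     'criteria': 'mrr',
                                     'burn_in': 100,        # always train at least 100 epochs
                                     'check_interval': 10,  # compute the criteria every 10 epochs
                                     'stop_interval': 3})   # stop after 3 checks with a decreasing metric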
@@ -2018,7 +2102,35 @@ def fit(self, X, early_stopping=False, early_stopping_params={}):
X : ndarray, shape [n, 3]
The training triples
early_stopping: bool
Flag to enable early stopping (default: ``False``).
If set to ``True``, the training loop adopts the following early stopping heuristic:

- The model will be trained regardless of early stopping for ``burn_in`` epochs.
- Every ``check_interval`` epochs the method will compute the metric specified in ``criteria``.

If the metric decreases for ``stop_interval`` checks, training stops early.

Note that the metric is computed on ``x_valid``, which is usually a held-out validation set.
Also, because ``criteria`` is a ranking metric, it requires generating negatives.
The entities used to generate corruptions can be specified, as well as the side(s) of the triples to corrupt.
The method also supports filtered metrics: pass an array of positives to ``x_filter``, and it will be used
to filter the negatives generated on the fly (i.e. the corruptions).

.. note::
    The early stopping criteria may introduce some overhead, caused by the metric computation.
    The goal is to strike a good trade-off between that overhead and the training epochs saved.
    A common approach is to use unfiltered MRR (a fuller usage sketch follows this excerpt)::

        early_stopping_params={'x_valid': X['valid'], 'criteria': 'mrr'}

    Note that the size of the validation set also contributes to the overhead;
    in most cases a smaller validation set is enough.
early_stopping_params: dictionary
Dictionary of hyperparameters for the early stopping heuristics.
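The excerpt above says that the entities used to generate corruptions, and the side(s) of a triple to
corrupt, can be specified, but it does not show the key names. The sketch below assumes the keys are
``corruption_entities`` and ``corrupt_side``; the model class and dataset loader are also illustrative
assumptions, so treat this as a hypothetical rather than the documented API::

    from ampligraph.datasets import load_wn18
    from ampligraph.latent_features import HolE  # assumed model class for illustration

    X = load_wn18()
    model = HolE(batches_count=50, epochs=300, k=100)

    # 'corruption_entities' and 'corrupt_side' are assumed key names (see lead-in above).
    model.fit(X['train'],
              early_stopping=True,
              early_stopping_params={'x_valid': X['valid'],
                                     'criteria': 'mrr',
                                     'corruption_entities': 'all',  # or an explicit list of entities
                                     'corrupt_side': 's+o'})        # corrupt both subject and object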
