
fix: Ensure eval mode for farm and transformer models for predictions #3791

Merged
merged 21 commits into main from inf-model-eval on Jun 6, 2023

Conversation

Contributor

@sjrl sjrl commented Dec 30, 2022

Related Issues

Proposed Changes:

I added model.eval() calls to ensure that the models in the following nodes are set to eval mode when running an inference prediction:

  • FARMReader
  • TransformerReader
  • EmbeddingRetriever (both with the _RetribertEmbeddingEncoder and _DefaultEmbeddingEncoder)
  • PromptNode
  • TransformersDocumentClassifier
  • TransformersTranslator
  • Text2SparqlRetriever
  • Text2Speech
  • EntityExtractor

Otherwise, if the model is left in train mode, the predictions become random because layers like dropout and batch normalization behave differently during training. This is something we already do for some of our nodes, such as the DensePassageRetriever.
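As a minimal illustration (not code from this PR) of why this matters, a model that contains a dropout layer returns different outputs for identical inputs while in train mode, and only becomes deterministic after eval() is called:

    import torch
    from torch import nn

    # Tiny illustrative model with a dropout layer.
    model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5), nn.Linear(4, 1))
    x = torch.ones(1, 4)

    model.train()                                  # train mode: dropout is active
    print(model(x).item(), model(x).item())        # two (usually) different values

    model.eval()                                   # eval mode: dropout becomes a no-op
    with torch.no_grad():
        print(model(x).item(), model(x).item())    # identical values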

How did you test it?

I added unit tests covering:

  • FARMReader and TransformerReader
  • EmbeddingRetriever
  • Text2Speech

These tests make sure the nodes provide correct results even if they are set to train mode before running a prediction. I confirmed that these unit tests fail without the new changes.
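The tests follow roughly this pattern (a sketch rather than the exact test added here; the attribute path reader.inferencer.model and the model name are assumptions):

    from haystack.nodes import FARMReader
    from haystack.schema import Document

    def test_farmreader_predictions_in_train_mode():
        reader = FARMReader(model_name_or_path="deepset/tinyroberta-squad2")  # example model
        docs = [Document(content="Berlin is the capital of Germany.")]
        query = "What is the capital of Germany?"

        expected = reader.predict(query=query, documents=docs)

        # Put the wrapped PyTorch model into train mode before predicting again.
        reader.inferencer.model.train()  # assumed attribute path to the underlying model
        result = reader.predict(query=query, documents=docs)

        # predict() should re-enable eval mode, so the answer must not change.
        assert result["answers"][0].answer == expected["answers"][0].answer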

Notes for the reviewer

  • This probably was not noticed before because the model is set to eval mode by default when loading it. However, this error could have occurred when using the nodes right after training. For example, using the predict function of the FARMReader right after training.
  • This was not an issue when using a SentenceTransformer model since the SentenceTransformer.encode function sets the underlying PyTorch model to eval mode within the function call.

Checklist

@sjrl sjrl requested a review from a team as a code owner December 30, 2022 12:56
@sjrl sjrl requested review from vblagoje and removed request for a team December 30, 2022 12:56
@sjrl sjrl added the topic:reader, topic:retriever, and topic:predictions labels Dec 30, 2022
@sjrl sjrl changed the title fix: Set Reader and Retriever models to eval mode for predictions fix: Set farm and transformer models to eval mode for predictions Dec 30, 2022
@sjrl sjrl changed the title fix: Set farm and transformer models to eval mode for predictions fix: Ensure eval mode for farm and transformer models for predictions Dec 30, 2022
Member

@vblagoje vblagoje left a comment


@sjrl looks good at first glance. Are there any other pipelines we use in the codebase? Let's not switch those explicitly to eval, as it's already done automatically.

Review comment on haystack/nodes/prompt/prompt_node.py (outdated, resolved)
@vblagoje
Member

vblagoje commented Jan 9, 2023

@sjrl wait wait, I don't agree that we should call the eval() switch on every model invocation. That's excessive; why do that? Take the example of an encoder: we need to encode millions of documents, and yet we'd call eval() on every call. I would respectfully disagree.

@sjrl
Contributor Author

sjrl commented Jan 9, 2023

wait wait, I don't agree that we should call the eval() switch on every model invocation. That's excessive; why do that? Take the example of an encoder: we need to encode millions of documents, and yet we'd call eval() on every call. I would respectfully disagree.

Hmm, well, Sentence Transformers already does this for its sentence encoder models whenever you call the encode function:

    def encode(self, sentences: Union[str, List[str]],
               batch_size: int = 32,
               show_progress_bar: bool = None,
               output_value: str = 'sentence_embedding',
               convert_to_numpy: bool = True,
               convert_to_tensor: bool = False,
               device: str = None,
               normalize_embeddings: bool = False) -> Union[List[Tensor], ndarray, Tensor]:
        """
        Computes sentence embeddings
        :param sentences: the sentences to embed
        :param batch_size: the batch size used for the computation
        :param show_progress_bar: Output a progress bar when encode sentences
        :param output_value:  Default sentence_embedding, to get sentence embeddings. Can be set to token_embeddings to get wordpiece token embeddings. Set to None, to get all output values
        :param convert_to_numpy: If true, the output is a list of numpy vectors. Else, it is a list of pytorch tensors.
        :param convert_to_tensor: If true, you get one large tensor as return. Overwrites any setting from convert_to_numpy
        :param device: Which torch.device to use for the computation
        :param normalize_embeddings: If set to true, returned vectors will have length 1. In that case, the faster dot-product (util.dot_score) instead of cosine similarity can be used.
        :return:
           By default, a list of tensors is returned. If convert_to_tensor, a stacked tensor is returned. If convert_to_numpy, a numpy matrix is returned.
        """
        self.eval()
        ...

which we call here in Haystack:

emb = self.embedding_model.encode(

So we are in fact already calling it on every model invocation for our EmbeddingRetriever node whenever we use Sentence Transformers (which we use most often in dC), and it has not seemed to impact our timings.

I don't think looping through all the layers in a model takes that long, but I can collect some timings on some of the Flan models to double-check.
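If timings would help, something along these lines would do it (a sketch; bert-base-uncased is just an example model):

    import timeit
    from transformers import AutoModel

    model = AutoModel.from_pretrained("bert-base-uncased")

    # eval() only flips the `training` flag on the module and all of its children,
    # so the cost should be negligible compared to a single forward pass.
    print(timeit.timeit(model.eval, number=1000))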

@vblagoje
Member

vblagoje commented Jan 9, 2023

Ok ok @sjrl, I wouldn't want you to waste your time on such tests now. Let's just get a nod from other team members as well. I am convinced by your proofs, but HF also doesn't switch the pipeline to eval on every call; maybe it's an omission. Let's confirm this change with the other team members. cc @mayankjobanputra @julian-risch @bogdankostic

@bogdankostic
Contributor

I don't have much experience with this, but I suppose that setting a model into eval mode shouldn't be a heavy operation, as it only affects a small fraction of layers, such as Dropout.
On the other hand, I'm not really sure this is needed for every node. I see that it's useful for FARMReader, because there we provide our own train method that sets the model to training mode. For the other nodes, where we don't provide our own train method, we allow users to load a model using only a model identifier or a local path and pass the task of actually loading the model to transformers, where it is set to eval mode by default. The users must therefore have set the models to training mode explicitly themselves, and I think we can expect them to set them back to eval mode as well. But that's just my point of view; I'm interested to hear what others think about this.

@vblagoje
Member

I echo Bogdan's comments. If we were a train-heavy framework, I'd say let's go for it. But we only train a few components! Are you concerned about some nefarious user actions, @sjrl? What motivates this change? Consistency? What else?

@julian-risch
Member

It looks to me as if self.train() and self.eval() just set a boolean flag for the module and all its children (i.e., the neural network layers).
https://github.com/pytorch/pytorch/blob/d24324bf1d2b921c9c631022c9009f4840ee6acb/torch/nn/modules/module.py#L2265
The forward pass of the modules takes into account the value of the boolean, for example, here: https://github.com/pytorch/pytorch/blob/d24324bf1d2b921c9c631022c9009f4840ee6acb/torch/nn/modules/batchnorm.py#L161
So the self.eval() operation itself shouldn't take long. However, I also think that calling self.eval() when a model is loaded should be enough.
In Haystack, we call self.model.train() at the beginning of the training here and thereby set the model into training mode, right? Maybe we could call self.model.eval() at the end of the training before returning the trained model here? Basically ensuring training mode temporarily during training and ensuring eval mode everywhere else because models are loaded in eval mode as Bogdan pointed out.
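Concretely, that would amount to a pattern like this (a minimal sketch; MyTrainableNode and the loop body are hypothetical stand-ins for the existing Haystack training code):

    from torch import nn

    class MyTrainableNode:
        def __init__(self, model: nn.Module):
            self.model = model

        def train(self, data_loader):
            self.model.train()            # training mode only for the duration of training
            for batch in data_loader:     # stand-in for the existing training loop
                ...
            self.model.eval()             # hand the model back in eval mode for inference
            return self.model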

@bogdankostic
Contributor

Maybe we could call self.model.eval() at the end of the training before returning the trained model here? Basically ensuring training mode temporarily during training and ensuring eval mode everywhere else because models are loaded in eval mode as Bogdan pointed out.

I like this idea!

@vblagoje
Member

@sjrl what do you think about the ideas ☝️ ?

@sjrl
Contributor Author

sjrl commented Jan 16, 2023

Maybe we could call self.model.eval() at the end of the training before returning the trained model here? Basically ensuring training mode temporarily during training and ensuring eval mode everywhere else because models are loaded in eval mode as Bogdan pointed out.

This sounds good to me! I'll go ahead and change the PR to do this instead. I'm fairly busy this week so I might not be able to get to it until next week.

@sjrl
Contributor Author

sjrl commented Mar 27, 2023

Hey @vblagoje sorry that this took me so long, but it is finished now! I've made the changes as discussed in this PR and it is ready for another review.

@sjrl sjrl requested a review from vblagoje March 27, 2023 07:54
Member

@vblagoje vblagoje left a comment


LGTM @sjrl, let's resolve this conflict and integrate this one.

@coveralls
Collaborator

coveralls commented May 8, 2023

Coverage Status

Coverage: 38.92% (-0.002%) from 38.921% when pulling 2002b65 on inf-model-eval into 8228081 on main.

@vblagoje vblagoje merged commit 1777b22 into main Jun 6, 2023
45 checks passed
@vblagoje vblagoje deleted the inf-model-eval branch June 6, 2023 11:06
Labels
topic:modeling, topic:predictions, topic:reader, topic:tests