Non-deterministic NLU training on GPU #6040

Closed
fparga opened this issue Jun 17, 2020 · 4 comments

fparga commented Jun 17, 2020

Rasa version: 1.9.4

Python version: 3.6.9

Operating system (windows, osx, ...): linux

Issue:
NLU training on GPU is not reproducible: each training run with the same pipeline and the same training set produces a model that performs differently. I understand that GPU training is non-deterministic by nature (although I think some progress has been made on that front), but the variations we see here are significant and make it really hard to compare the performance impact of pipeline configuration or parameter tuning.

We don't have this problem when training on CPU.

Note: we have 2701 intent examples (5 distinct intents)
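
One possible mitigation (untested here) is TensorFlow's determinism setting; this is only a sketch, and it assumes the TensorFlow version underneath Rasa honours the TF_DETERMINISTIC_OPS environment variable:

TF_DETERMINISTIC_OPS=1 rasa train nlu --nlu training_data.json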

Command or request that led to the issue:

rasa train nlu  --nlu training_data.json
rasa test nlu -m models --successes --no-plot --nlu test_data.json

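A sketch of how two runs could be compared side by side, assuming the Rasa 1.x --fixed-model-name flag and the default results/ output directory of rasa test nlu (names below are illustrative):

# Train on the same data twice, evaluate each model, and keep both result sets
rasa train nlu --nlu training_data.json --fixed-model-name nlu_run_a
rasa test nlu -m models/nlu_run_a.tar.gz --successes --no-plot --nlu test_data.json
mv results results_run_a
rasa train nlu --nlu training_data.json --fixed-model-name nlu_run_b
rasa test nlu -m models/nlu_run_b.tar.gz --successes --no-plot --nlu test_data.json
mv results results_run_b
# On CPU the two reports match; on GPU they differ between runs
diff results_run_a/intent_report.json results_run_b/intent_report.json
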
Content of configuration file (config.yml):

language: fr
pipeline:
  - name: WhitespaceTokenizer
    case_sensitive: False
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
    random_seed: 4321
fparga added the area:rasa-oss 🎡 and type:bug 🐛 labels on Jun 17, 2020
@sara-tagger
Collaborator

Thanks for the issue, @alwx will get back to you about it soon!

You may find help in the docs and the forum, too 🤗


fparga commented Jun 18, 2020

Some more context: in our experiments, the predicted intents seem to be the same when running tests on two models trained on GPU with the same training set (and pipeline). It's the confidence scores that vary wildly, whereas they are identical when the training is done on CPU.
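
The difference shows up directly in the files written by rasa test nlu --successes. Assuming the Rasa 1.x results/intent_successes.json layout (each entry carrying an intent_prediction with name and confidence), the confidences from the two runs sketched above can be compared like this:

# Pull the predicted confidences out of each run's success report (field names assumed, see above)
jq '.[].intent_prediction.confidence' results_run_a/intent_successes.json > confidences_a.txt
jq '.[].intent_prediction.confidence' results_run_b/intent_successes.json > confidences_b.txt
diff confidences_a.txt confidences_b.txt   # identical for CPU-trained models, different for GPU-trained ones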


stale bot commented Sep 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label on Sep 19, 2020

stale bot commented Sep 29, 2020

This issue has been automatically closed due to inactivity. Please create a new issue if you need more help.
