checkpoint_model: True leads to KeyError: ‘val_i_acc’ #8296

Closed
ChrisRahme opened this issue Mar 25, 2021 · 3 comments
Labels
area:rasa-oss 🎡: Anything related to the open source Rasa framework
type:bug 🐛: Inconsistencies or issues which will cause an issue or problem for users or implementors.


ChrisRahme commented Mar 25, 2021

Rasa Version: 2.4.2
Rasa SDK Version: 2.4.1
Rasa X Version: 0.38.0
Python Version: 3.8.0
Operating System: Windows-10-10.0.19041-SP0

Issue: checkpoint_model: True leads to KeyError: ‘val_i_acc’

After updating from 2.3.4 to 2.4.2 and training, I get the error below.

It's caused by having checkpoint_model: True in the pipeline. Training works after removing it.

Minimal reproducible steps:

  • rasa init in empty folder
  • Change config.yml by adding checkpoint_model: True to the DIETClassifier component

Error (including full traceback):

2021-03-25 13:27:24.173715: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-03-25 13:27:24.173831: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-03-25 13:27:33.418255: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2021-03-25 13:27:33.799658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 960M computeCapability: 5.0 coreClock: 1.176GHz coreCount: 5 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 74.65GiB/s
2021-03-25 13:27:33.803059: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-03-25 13:27:33.806215: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cublas64_10.dll'; dlerror: cublas64_10.dll not found
2021-03-25 13:27:33.809974: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cufft64_10.dll'; dlerror: cufft64_10.dll not found
2021-03-25 13:27:33.812569: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2021-03-25 13:27:33.815738: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusolver64_10.dll'; dlerror: cusolver64_10.dll not found
2021-03-25 13:27:33.819405: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cusparse64_10.dll'; dlerror: cusparse64_10.dll not found
2021-03-25 13:27:33.822059: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudnn64_7.dll'; dlerror: cudnn64_7.dll not found
2021-03-25 13:27:33.822282: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
The configuration for policies was chosen automatically. It was written into the config file at 'config.yml'.
Training NLU model...
E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\utils\train_utils.py:558: UserWarning: constrain_similarities is set to False. It is recommended to set it to True when using cross-entropy loss. It will be set to True by default, Rasa Open Source 3.0.0 onwards.
  rasa.shared.utils.io.raise_warning(
E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\utils\train_utils.py:531: UserWarning: model_confidence is set to softmax. It is recommended to try using model_confidence=linear_norm to make it easier to tune fallback thresholds.
  rasa.shared.utils.io.raise_warning(
2021-03-25 13:27:36 INFO rasa.shared.nlu.training_data.training_data - Training data stats:
2021-03-25 13:27:36 INFO rasa.shared.nlu.training_data.training_data - Number of intent examples: 69 (7 distinct intents)
2021-03-25 13:27:36 INFO rasa.shared.nlu.training_data.training_data - Found intents: 'mood_great', 'affirm', 'goodbye', 'mood_unhappy', 'bot_challenge', 'greet', 'deny'
2021-03-25 13:27:36 INFO rasa.shared.nlu.training_data.training_data - Number of response examples: 0 (0 distinct responses)
2021-03-25 13:27:36 INFO rasa.shared.nlu.training_data.training_data - Number of entity examples: 0 (0 distinct entities)
2021-03-25 13:27:36 INFO rasa.nlu.model - Starting to train component WhitespaceTokenizer
2021-03-25 13:27:36 INFO rasa.nlu.model - Finished training component.
2021-03-25 13:27:36 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2021-03-25 13:27:36 INFO rasa.nlu.model - Finished training component.
2021-03-25 13:27:36 INFO rasa.nlu.model - Starting to train component LexicalSyntacticFeaturizer
2021-03-25 13:27:36 INFO rasa.nlu.model - Finished training component.
2021-03-25 13:27:36 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2021-03-25 13:27:36 INFO rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - 80 vocabulary slots consumed out of 1080 slots configured for text attribute.
2021-03-25 13:27:36 INFO rasa.nlu.model - Finished training component.
2021-03-25 13:27:36 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2021-03-25 13:27:36 INFO rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - 697 vocabulary slots consumed out of 1697 slots configured for text attribute.
2021-03-25 13:27:36 INFO rasa.nlu.model - Finished training component.
2021-03-25 13:27:36 INFO rasa.nlu.model - Starting to train component DIETClassifier
2021-03-25 13:27:36.981605: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-03-25 13:27:36.998896: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x217abed93f0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-03-25 13:27:36.999071: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-03-25 13:27:36.999500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-25 13:27:36.999649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]
Epochs: 6%|████████ | 6/100 [00:17<04:49, 3.08s/it, t_loss=6.06, i_acc=0.407]
Traceback (most recent call last):
  File "E:\Program Files\Python\Python38\lib\runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "E:\Program Files\Python\Python38\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\__main__.py", line 134, in <module>
    main()
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\__main__.py", line 116, in main
    cmdline_arguments.func(cmdline_arguments)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\cli\train.py", line 58, in <lambda>
    train_parser.set_defaults(func=lambda args: train(args, can_exit=True))
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\cli\train.py", line 90, in train
    training_result = rasa.train(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\train.py", line 94, in train
    return rasa.utils.common.run_in_loop(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\utils\common.py", line 307, in run_in_loop
    result = loop.run_until_complete(f)
  File "E:\Program Files\Python\Python38\lib\asyncio\base_events.py", line 608, in run_until_complete
    return future.result()
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\train.py", line 163, in train_async
    return await _train_async_internal(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\train.py", line 342, in _train_async_internal
    await _do_training(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\train.py", line 388, in _do_training
    model_path = await _train_nlu_with_validated_data(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\train.py", line 812, in _train_nlu_with_validated_data
    await rasa.nlu.train(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\nlu\train.py", line 115, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\nlu\model.py", line 209, in train
    updates = component.train(working_data, self.config, **context)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\nlu\classifiers\diet_classifier.py", line 854, in train
    self.model.fit(
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\utils\tensorflow\temp_keras_modules.py", line 229, in fit
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\tensorflow\python\keras\callbacks.py", line 416, in on_epoch_end
    callback.on_epoch_end(epoch, numpy_logs)
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\utils\tensorflow\callback.py", line 68, in on_epoch_end
    if self._does_model_improve(logs):
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\utils\tensorflow\callback.py", line 90, in _does_model_improve
    [
  File "E:\Documents\USJ\Engineering\Semestre 6\FYP\Workspace_venv\lib\site-packages\rasa\utils\tensorflow\callback.py", line 91, in <listcomp>
    float(current_results[key]) > self.best_metrics_so_far[key]
KeyError: 'val_i_acc'
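
The last frame of the traceback indexes two dictionaries with the same key, `current_results[key]` and `self.best_metrics_so_far[key]`. Below is a minimal sketch of how a comparison of that shape can raise this `KeyError` when one dictionary lacks a key the other has. The class `CheckpointSketch` and both method names are hypothetical; this is not Rasa's callback code, and the traceback alone does not show which of the two dictionaries was missing `val_i_acc`.

```python
from typing import Dict


class CheckpointSketch:
    """Hypothetical stand-in for a checkpointing callback (not Rasa's code)."""

    def __init__(self) -> None:
        # Assumption for this sketch: no best values are known before the
        # first evaluation, so the dictionary starts out empty.
        self.best_metrics_so_far: Dict[str, float] = {}

    def does_model_improve_strict(self, current_results: Dict[str, float]) -> bool:
        # Direct indexing on both sides: a key present in one dictionary
        # but absent from the other raises KeyError.
        return all(
            [
                float(current_results[key]) > self.best_metrics_so_far[key]
                for key in current_results
            ]
        )

    def does_model_improve_safe(self, current_results: Dict[str, float]) -> bool:
        # Defensive variant: only look at validation metrics and treat a
        # metric that has never been recorded as an improvement.
        improved = False
        for key, value in current_results.items():
            if not key.startswith("val_"):
                continue
            best = self.best_metrics_so_far.get(key)
            if best is None or float(value) > best:
                self.best_metrics_so_far[key] = float(value)
                improved = True
        return improved
```

With `logs = {"val_i_acc": 0.4}`, the strict method raises `KeyError: 'val_i_acc'` on a fresh instance, while the safe method records the value and reports an improvement.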

Command or request that led to error:

rasa train

Content of configuration file (config.yml):

language: en

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
    evaluate_on_number_of_examples: 10
    evaluate_every_number_of_epochs: 5
    checkpoint_model: True
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1

policies:

testbot.zip
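
The workaround reported above (training works once `checkpoint_model` is removed) can be sketched as a small helper that operates on an already-parsed config. The function name `without_checkpointing` is hypothetical, and the sketch assumes the config has been loaded into a plain dictionary with a `pipeline` list of component dictionaries; it is not part of Rasa.

```python
import copy
from typing import Any, Dict


def without_checkpointing(
    config: Dict[str, Any], component: str = "DIETClassifier"
) -> Dict[str, Any]:
    """Return a copy of the config with checkpoint_model dropped from one component."""
    cleaned = copy.deepcopy(config)  # leave the caller's config untouched
    for entry in cleaned.get("pipeline") or []:
        if isinstance(entry, dict) and entry.get("name") == component:
            entry.pop("checkpoint_model", None)
    return cleaned


# Trimmed-down version of the pipeline from this report.
example_config = {
    "language": "en",
    "pipeline": [
        {"name": "WhitespaceTokenizer"},
        {
            "name": "DIETClassifier",
            "epochs": 100,
            "evaluate_on_number_of_examples": 10,
            "evaluate_every_number_of_epochs": 5,
            "checkpoint_model": True,
        },
    ],
}
cleaned_config = without_checkpointing(example_config)
```

The evaluation settings are kept as they are; only the `checkpoint_model` key is removed, which matches the single change described in the report.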

ChrisRahme added the area:rasa-oss 🎡 and type:bug 🐛 labels Mar 25, 2021
sara-tagger (Collaborator) commented Mar 25, 2021

Thanks for raising this issue, @rctatman will get back to you about it soon.

Please also check out the docs and the forum in case your issue was raised there too 🤗

stale bot commented Jul 21, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

m-vdb added the type:bug 🐛 label Oct 10, 2022
m-vdb (Collaborator) commented Dec 7, 2022

Closing as stale

m-vdb closed this as not planned Dec 7, 2022