Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IntentContainer.instantiate_from_disk() method does not load data for padaos regex patterns #23

Closed
repodiac opened this issue Mar 28, 2020 · 1 comment

Comments

@repodiac
Copy link
Contributor

Hi,

I contributed the instantiate_from_disk() functionality some time ago (short recap: you can reuse and load trained models from disk via utilizing cached contents which have been stored externally).

Unfortunately, there is a bug I found out recently. I fixed it already with some kind of workaround. In the following I describe the details/background.

Background:

  • As I already found out and noted down as comment in the source code, the padaos (https://github.com/MycroftAI/padaos) regex "compiler" has to recompile also when loading the trained models from disk.
  • The current interface to this component does not allow to persist patterns to disk and reload them in the same fashion as instantiate_from_disk() does. Another code change and pull request for this project might be necessary here but that can be discussed on the padaos repo website.
  • The bug is/was that line self.padaos.add_entity(name, lines) in method add_entity of intent_container.py contained empty lines - I provided an empty list via instantiate_from_disk() since for reloading/instantiating from disk no training data is (usually) necessary! In this case it is, unfortunately (due to the need to recompile patterns with padaos)
  • The workaround (i.e. fix) was to provide the original training data for both entities and intents to the padaos compiler so that it can work with all the text for outputting the required patterns eventually.

Implications: If no contents for entities and intents are provided to the padas compiler, most entities embedded in intents are not detected and some intents tend to be completely wrong most of the time. So it is rather severe.

Solution: As said before, I would like to contribute the fix via another pull request. The fixed functionality has also been considered in an improved unit test (in test_container.test_instantiate_from_disk())

repodiac added a commit to repodiac/padatious that referenced this issue Mar 28, 2020
repodiac added a commit to repodiac/padatious that referenced this issue Mar 30, 2020
forslund pushed a commit to repodiac/padatious that referenced this issue May 25, 2020
forslund pushed a commit that referenced this issue May 25, 2020
@repodiac
Copy link
Contributor Author

done (merged)

clrpackages pushed a commit to clearlinux-pkgs/padatious that referenced this issue May 28, 2020
… 0.4.8

Kris Gesling (1):
      Update README for repo restructure

Matthew D. Scholefield (1):
      Fix docstring for add_entity

repodiac (3):
      support for persisting models via utilizing cached versions of resources
      adapted code to comply with PEP8 via `autopep8 --in-place --aggressive --aggressive <file.py>`
      fix and tests for issue #23 (MycroftAI/padatious#23)

stratus-ss (4):
      Adds support for handling apostrophes in utterances
      Added a test for match_data.py to make sure apostrophes are handled properly
      check for words that end in an apostrophe and correct spacing
      add test for 's' + apostrophe possessive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant