Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.0 architecture revamp/9456/language model featurizer2 #9586

Merged
merged 31 commits into from
Sep 13, 2021

Conversation

aeshky
Copy link
Contributor

@aeshky aeshky commented Sep 9, 2021

Proposed changes:

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

implemented required_packages and supported_languages
renamed `train` to `process_training_data`
supported multiple messages in `process`
@aeshky aeshky requested a review from a team as a code owner September 9, 2021 09:52
@aeshky aeshky requested review from a team, twerkmeister, koernerfelicia and alopez and removed request for a team, twerkmeister, koernerfelicia and alopez September 9, 2021 09:52
@aeshky aeshky changed the title 3.0 architecture revamp/9456/language model featurizer2 Draft PR: 3.0 architecture revamp/9456/language model featurizer2 Sep 9, 2021
@aeshky aeshky changed the title Draft PR: 3.0 architecture revamp/9456/language model featurizer2 3.0 architecture revamp/9456/language model featurizer2 Sep 10, 2021
@aeshky
Copy link
Contributor Author

aeshky commented Sep 10, 2021

@ka-bu do you have capacity to review? I tagged you because I wanted to make sure I correctly used your featurizer classes 😁

Copy link
Contributor

@wochinge wochinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
@aeshky
Copy link
Contributor Author

aeshky commented Sep 10, 2021

Looks like some of my tests are failing. The ones I'm able to run locally on my machine are passing. I'll look into the CI ones.

@ka-bu
Copy link
Contributor

ka-bu commented Sep 10, 2021

Is it possible to run the code quality checks locally?

yep, you can run: make formatter, make types, make lint (cf. Makefile in main folder)

Copy link
Contributor

@wochinge wochinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments but looking great otherwise!

rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
rasa/nlu/registry.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
rasabot and others added 11 commits September 13, 2021 10:40
https://github.com/RasaHQ/rasa into 3.0-architecture-revamp/9456/LanguageModelFeaturizer2

* '3.0-architecture-revamp/9456/LanguageModelFeaturizer2' of https://github.com/RasaHQ/rasa:
  docstrings + main adaptions
  pass persisted oov words via constructor instead of config
  persist OOV_words separately
  migrate `CountVectorsFeaturizer`
  duplicate to prepare migration
changing the signature of test functions.

Co-authored-by: Tobias Wochinger <t.wochinger@rasa.com>
https://github.com/RasaHQ/rasa into 3.0-architecture-revamp/9456/LanguageModelFeaturizer2

* '3.0-architecture-revamp/9456/LanguageModelFeaturizer2' of https://github.com/RasaHQ/rasa:
  convert featurizer (#9596)
https://github.com/RasaHQ/rasa into 3.0-architecture-revamp/9456/LanguageModelFeaturizer2

* '3.0-architecture-revamp/9456/LanguageModelFeaturizer2' of https://github.com/RasaHQ/rasa:
  amend checkpoint test
  tidy up tests
  amend diagnostic data check
  edit component load method
  attempt at adapting remaining unit test
  adjust failing tests and remove duplicate create method
  some more adapted tests
  adapt one more unit test
  adapt response selector and most of unit tests
@aeshky
Copy link
Contributor Author

aeshky commented Sep 13, 2021

Okay I think I addressed all of your comments @wochinge and @ka-bu 🙂

@wochinge
Copy link
Contributor

already on it 🏃🏻

Copy link
Contributor

@wochinge wochinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py Outdated Show resolved Hide resolved
def required_components(cls) -> List[Type[Component]]:
"""Packages needed to be installed."""
return [Tokenizer]
def validate_config(cls, config: Dict[Text, Any]) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ka-bu What do you think of doing this with Python Protocols in the future? Too many of the components implement this with pass. I think using protocols instead would make this more flexible

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. I like the idea of protocols, but the nice thing about the abstract method is that it forces everyone to think about this and implement it. With the protocol we might miss some essential validation without even noticing it (although that's not really a problem with validate_config - was just thinking of the other validate_.. where we replaced tokenizer by tokenizer type or so).
And if we use the validate_config in all components eventually then it doesn't make a difference really whether it's a protocol or there's a version in GraphComponent that just passes, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points 👍🏻 Let's just see how this develops and find the right abstractions then 👍🏻

tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
tests/nlu/featurizers/test_lm_featurizer.py Outdated Show resolved Hide resolved
):
component = LanguageModelFeaturizer(
{"model_name": model_name}, skip_model_load=True
monkeypatch.setattr(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way better than having production code changed for pytest 🚀

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one thing: should we move the monkeypatching into the fixture to avoid repeating this?

aeshky and others added 4 commits September 13, 2021 17:26
docstring edits and removing language from config (by Tobi)

Co-authored-by: Tobias Wochinger <t.wochinger@rasa.com>
https://github.com/RasaHQ/rasa into 3.0-architecture-revamp/9456/LanguageModelFeaturizer2

* '3.0-architecture-revamp/9456/LanguageModelFeaturizer2' of https://github.com/RasaHQ/rasa:
  Apply suggestions from code review
@aeshky aeshky merged commit a4ef77d into main Sep 13, 2021
@aeshky aeshky deleted the 3.0-architecture-revamp/9456/LanguageModelFeaturizer2 branch September 13, 2021 16:58
@wochinge
Copy link
Contributor

🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants