3.0 architecture revamp/9456/language model featurizer2 #9586
implemented `required_packages` and `supported_languages`
renamed `train` to `process_training_data`
supported multiple messages in `process`
Added missing function to class.
Adapted more tests.
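To make the interface changes named in the first commit message concrete, here is a minimal sketch of a component with that shape. It is not the PR's code: `Message` and `SketchFeaturizer` are hypothetical stand-ins, the return values of `required_packages` and `supported_languages` are assumed for illustration, and the "feature" is a simple token count rather than a language model embedding.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for a message; not Rasa's real class."""

    text: str
    features: Dict[str, Any] = field(default_factory=dict)


class SketchFeaturizer:
    """Toy featurizer that mirrors the method names from the commit message."""

    @staticmethod
    def required_packages() -> List[str]:
        # Packages that must be installed for the component to work.
        return ["transformers"]  # assumed value, for illustration only

    @staticmethod
    def supported_languages() -> Optional[List[str]]:
        # None is used here to mean "no language restriction" (an assumption).
        return None

    def _featurize(self, message: Message) -> None:
        # Placeholder feature: a token count instead of a real embedding.
        message.features["num_tokens"] = len(message.text.split())

    def process_training_data(self, training_examples: List[Message]) -> List[Message]:
        # Formerly `train`: featurizes the training examples.
        return self.process(training_examples)

    def process(self, messages: List[Message]) -> List[Message]:
        # Takes a list of messages rather than a single message.
        for message in messages:
            self._featurize(message)
        return messages
```

With this shape, training and inference share one code path, and callers can pass a batch of messages to `process` in a single call.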
@ka-bu do you have capacity to review? I tagged you because I wanted to make sure I correctly used your featurizer classes 😁
💯
Looks like some of my tests are failing. The ones I'm able to run locally on my machine are passing. I'll look into the CI ones.
yep, you can run:
A few more comments but looking great otherwise!
https://github.com/RasaHQ/rasa into 3.0-architecture-revamp/9456/LanguageModelFeaturizer2
* '3.0-architecture-revamp/9456/LanguageModelFeaturizer2' of https://github.com/RasaHQ/rasa:
  docstrings + main adaptions
  pass persisted oov words via constructor instead of config
  persist OOV_words separately
  migrate `CountVectorsFeaturizer`
  duplicate to prepare migration
changing the signature of test functions. Co-authored-by: Tobias Wochinger <t.wochinger@rasa.com>
removed registry change
https://github.com/RasaHQ/rasa into 3.0-architecture-revamp/9456/LanguageModelFeaturizer2 * '3.0-architecture-revamp/9456/LanguageModelFeaturizer2' of https://github.com/RasaHQ/rasa: convert featurizer (#9596)
https://github.com/RasaHQ/rasa into 3.0-architecture-revamp/9456/LanguageModelFeaturizer2
* '3.0-architecture-revamp/9456/LanguageModelFeaturizer2' of https://github.com/RasaHQ/rasa:
  amend checkpoint test
  tidy up tests
  amend diagnostic data check
  edit component load method
  attempt at adapting remaining unit test
  adjust failing tests and remove duplicate create method
  some more adapted tests
  adapt one more unit test
  adapt response selector and most of unit tests
already on it 🏃🏻
💯
def required_components(cls) -> List[Type[Component]]:
    """Packages needed to be installed."""
    return [Tokenizer]
def validate_config(cls, config: Dict[Text, Any]) -> None:
@ka-bu What do you think of doing this with Python Protocols in the future? Too many of the components implement this with `pass`. I think using protocols instead would make this more flexible.
I'm not sure. I like the idea of protocols, but the nice thing about the abstract method is that it forces everyone to think about this and implement it. With the protocol we might miss some essential validation without even noticing it (although that's not really a problem with `validate_config` - was just thinking of the other `validate_..` where we replaced tokenizer by tokenizer type or so).
And if we use the `validate_config` in all components eventually then it doesn't make a difference really whether it's a protocol or there's a version in `GraphComponent` that just passes, no?
Good points 👍🏻 Let's just see how this develops and find the right abstractions then 👍🏻
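For readers following the trade-off in this thread, here is a minimal sketch of both options. It is not code from the PR: `AbstractComponent`, `PassingComponent`, `ForgetfulComponent`, `ValidatesConfig`, `PlainComponent`, and `validate_if_supported` are all hypothetical names, and only `validate_config` comes from the diff above.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Protocol, runtime_checkable


class AbstractComponent(ABC):
    """Option 1: an abstract method that every subclass must implement."""

    @classmethod
    @abstractmethod
    def validate_config(cls, config: Dict[str, Any]) -> None:
        ...


class PassingComponent(AbstractComponent):
    @classmethod
    def validate_config(cls, config: Dict[str, Any]) -> None:
        pass  # the "implement it with pass" pattern mentioned above


class ForgetfulComponent(AbstractComponent):
    pass  # no validate_config: instantiating this class raises TypeError


@runtime_checkable
class ValidatesConfig(Protocol):
    """Option 2: a protocol that components may opt into."""

    def validate_config(self, config: Dict[str, Any]) -> None:
        ...


class PlainComponent:
    pass  # does not validate its config, and nothing forces it to


def validate_if_supported(component: Any, config: Dict[str, Any]) -> bool:
    # Validation runs only if the component provides the method;
    # otherwise it is skipped silently, which is the risk raised above.
    if isinstance(component, ValidatesConfig):
        component.validate_config(config)
        return True
    return False
```

The abstract method fails loudly (`ForgetfulComponent()` raises `TypeError`), while the protocol removes the empty `pass` implementations at the cost of validation being skipped without notice for classes like `PlainComponent`.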
):
component = LanguageModelFeaturizer(
{"model_name": model_name}, skip_model_load=True
monkeypatch.setattr(
Way better than having production code changed for pytest
🚀
one thing: should we move the monkeypatching into the fixture to avoid repeating this?
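One way the suggestion could look, as a sketch only: the patch is applied once inside a pytest fixture, and tests request the fixture instead of repeating `monkeypatch.setattr`. `FakeFeaturizer`, `_load_model`, `skip_model_loading`, and `create_featurizer` are hypothetical names; the attribute actually patched in the PR is not shown in the excerpt above.

```python
import pytest


class FakeFeaturizer:
    """Hypothetical stand-in for a component that loads a large model."""

    def __init__(self, config):
        self.config = config
        self.model = self._load_model()

    def _load_model(self):
        # Stands in for an expensive download that tests should avoid.
        raise RuntimeError("would download a large model")


def skip_model_loading(monkeypatch) -> None:
    # Replace the expensive loader with a no-op; pytest undoes the patch
    # automatically when the test that requested `monkeypatch` finishes.
    monkeypatch.setattr(FakeFeaturizer, "_load_model", lambda self: None)


@pytest.fixture
def create_featurizer(monkeypatch):
    # The patch lives in the fixture, so individual tests do not repeat it.
    skip_model_loading(monkeypatch)

    def inner(config):
        return FakeFeaturizer(config)

    return inner


def test_featurizer_is_created_without_model(create_featurizer):
    component = create_featurizer({"model_name": "bert"})
    assert component.model is None
```

Because the fixture depends on `monkeypatch`, the patch is scoped to each test that uses `create_featurizer`, and production code needs no test-only flag such as `skip_model_load`.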
docstring edits and removing language from config (by Tobi) Co-authored-by: Tobias Wochinger <t.wochinger@rasa.com>
https://github.com/RasaHQ/rasa into 3.0-architecture-revamp/9456/LanguageModelFeaturizer2 * '3.0-architecture-revamp/9456/LanguageModelFeaturizer2' of https://github.com/RasaHQ/rasa: Apply suggestions from code review
🚀
Proposed changes:
`LanguageModelFeaturizer` according to "Migrate `LanguageModelFeaturizer` to `GraphComponent` interface" #9456
Status (please check what you already did):
`black` (please check Readme for instructions)