Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve naming #18

Closed
7 tasks done
kengelbrecht opened this issue Jul 18, 2022 · 4 comments
Closed
7 tasks done

Improve naming #18

kengelbrecht opened this issue Jul 18, 2022 · 4 comments
Assignees

Comments

@kengelbrecht
Copy link
Collaborator

kengelbrecht commented Jul 18, 2022

  • NLUdataset.to_json/from_json are not compatible -> change interface so that what has been written with to_json() can be loaded with from_json()

Optimierung von Namen

  • rasa -> rasa2 (because the outdated rasa version should not be named like the default one)
  • NLUdataset -> NluDataset
  • NLUdataset.from_joined() -> NLUdataset.concat() (bzw. äquivalent zu Pandas benennen)
  • Make classifier names more consistent.
    - TfidfIntentClassifier -> OK
    - SpacyClassifier -> Spacy
    - TelekomModel -> CharNgramIntentClassifier
    - Rasa2/3 -> OK
    - Watson -> WatsonAssistant
    - LUIS -> Luis
    - FastText -> OK
  • Clean up vendor file names
    - telekom -> char_ngram_intent_classifier
    - watson -> watson_assistant
  • Change datasets.py -> nlu_dataset.py to avoid name conflicts with huggingface datasets and be more consistent in nameing (file called like class) and also change vendors.py -> vendor.py for consistency with nlu_dataset

Reasoning behind the vendor name changes:

  • Should be short
  • Should not make claims that may not hold in the future (e.g. SpacyClassifier might in fact be extended with entity recognition)
  • TelekomModel was renamed to avoid branding issues and avoid someone believing Telekom sells this.
  • Watson renamed to WatsonAssistant as Watson has also other NLU products, so was unclear which is used before
  • LUIS to Luis so Acronyms are written in the same way everywhere
  • Added "IntentClassifier" for our own implementations to add clarity (these will not be extended with entity recognition)
@kengelbrecht kengelbrecht self-assigned this Jul 18, 2022
@kengelbrecht
Copy link
Collaborator Author

Created branch feature/better_names

  • Rasa now Rasa2 (also changed file name to rasa2)
  • NLUdataset now NluDataset

@kengelbrecht
Copy link
Collaborator Author

Added function nlubridge.concat(dataset_list) analoguous to pandas.concat().
Added NluDataset method joined = NluDataset.join(other_dataset) analoguous to pandas.DataFrame.join().
Added DeprecationWarning for NluDataset method NluDataset.from_joined; should use one of the new methods (most likely concat() in this case)

@kengelbrecht
Copy link
Collaborator Author

to_json/from_json already works correctly. Added a test to assert from_json works with the string returned by to_json(path=None).

@kengelbrecht
Copy link
Collaborator Author

Changes as outlined in the description merged to develop with commit 9e93bc8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant