Modified to be able to use HFTransformers NLP in Japanese #5745

harada4atsushi · 2020-04-29T05:12:03Z

Proposed changes:

Issue: HFTransformersNLP does not work with pretrained Japanese BERT models #5744
Fixed by using BertJapaneseTokenizer instead of BertTokenizer when Japanese model_weights are specified.

Status (please check what you already did):

added some tests for the functionality
updated the documentation
updated the changelog (please check changelog for instructions)
reformat files using black (please check Readme for instructions)

CLAassistant · 2020-04-29T05:12:08Z

All committers have signed the CLA.

sara-tagger · 2020-04-29T06:00:06Z

Thanks for opening a draft pull request 🚀 If you have any questions, you can direct them to @tabergma ✨

dakshvar22 · 2020-04-29T07:50:27Z

@harada4atsushi Thanks for suggesting a fix. For better maintainability of the code, I would suggest creating a custom component for yourself at this point. The component can inherit from the HFTransformersNLP component and override only the _load_model method to load the corresponding Japanese-specific tokenizer and model.
Adding a conditional construct in Rasa for just one language could get out of hand very soon, IMO. There might be a better way to solve this using AutoTokenizers, but that would also require more refactoring of the internal implementation of HFTransformersNLP.
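
A minimal sketch of the custom-component pattern suggested above, for illustration only. To stay runnable without Rasa or transformers installed, `HFTransformersNLPStandIn` is a hypothetical stand-in for the real HFTransformersNLP class, and the tokenizer is represented by its class name rather than a loaded object. The `tokenizer_name` attribute and the `"bert-base-japanese"` weights value are assumptions made for the example, not names taken from this PR.

```python
from typing import Any, Dict, Text


class HFTransformersNLPStandIn:
    """Hypothetical stand-in for Rasa's HFTransformersNLP component.

    It models only what the thread mentions: a _load_model method
    that decides which tokenizer is used.
    """

    def __init__(self, component_config: Dict[Text, Any]) -> None:
        self.component_config = component_config
        self._load_model()

    def _load_model(self) -> None:
        # Assumption: the stock component ends up with a generic BERT tokenizer.
        self.model_weights = self.component_config["model_weights"]
        self.tokenizer_name = "BertTokenizer"


class JapaneseHFTransformersNLP(HFTransformersNLPStandIn):
    """Custom component: inherits everything and overrides only _load_model."""

    def _load_model(self) -> None:
        self.model_weights = self.component_config["model_weights"]
        # A real component would load the tokenizer here, for example with
        # transformers.BertJapaneseTokenizer.from_pretrained(self.model_weights).
        self.tokenizer_name = "BertJapaneseTokenizer"


component = JapaneseHFTransformersNLP({"model_weights": "bert-base-japanese"})
print(component.tokenizer_name)
```

Because the language-specific choice lives in the subclass, the base component needs no per-language conditional, which is the maintainability concern raised in this comment.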

harada4atsushi · 2020-05-01T23:50:03Z

@dakshvar22 Thank you for your comment and suggestion. You're right; I will try implementing a custom component.

Modified to be able to use HFTransformers NLP in Japanese

89b530d

harada4atsushi added 3 commits April 29, 2020 16:34

update transformers version

37259eb

fix lint error

0c08517

add HFTransformersNLP testcase

38a5aa9

harada4atsushi closed this May 1, 2020
