
I want to add google's bert as a new feature #1754

Closed
winstarwang opened this issue Mar 4, 2019 · 41 comments
Labels
type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR

Comments

@winstarwang

I have tested Google's BERT as a text classifier, and it performs very well at my company. So I want to bring BERT's power to Rasa NLU. I'm working on it at the moment; is anyone else interested?

@robinsongh381

Can you explain further and perhaps share part of your work?

Are you applying BERT to intent classification, entity extraction, or something else?

@winstarwang
Author

Can you explain further and perhaps share part of your work?

Are you applying BERT to intent classification, entity extraction, or something else?

I apply BERT to intent classification; the working language is Chinese.

@robinsongh381

Did you append the word vector of each token from BERT to the featurization?

@akelad
Contributor

akelad commented Mar 4, 2019

Hi @winstarwang, we're actually playing around with this ourselves at the moment; thanks for the suggestion. There's a branch called bert_embeddings which you can experiment with if you like. One of our major concerns so far is that prediction with BERT takes quite a long time.

@akelad akelad added the type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR label Mar 4, 2019
@winstarwang
Author

Did you append the word vector of each token from BERT to the featurization?

Yes.
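
For readers wondering what "appending BERT vectors to the featurization" might look like in practice, here is a minimal, hypothetical sketch of a custom featurizer against the rasa_nlu 0.14-era Component API; the bert_encode helper is a stand-in, not code from this thread:

import numpy as np
from rasa_nlu.components import Component

def bert_encode(text):
    # Stand-in for a real BERT encoder; a real version would return,
    # e.g., mean-pooled final-layer token embeddings (dim 768).
    return np.zeros(768, dtype=np.float32)

class BertFeaturizerSketch(Component):
    name = "intent_featurizer_bert_sketch"
    provides = ["text_features"]

    def train(self, training_data, cfg, **kwargs):
        for example in training_data.intent_examples:
            self.process(example)

    def process(self, message, **kwargs):
        bert_vec = bert_encode(message.text)
        existing = message.get("text_features")  # features set by earlier components
        features = bert_vec if existing is None else np.hstack([existing, bert_vec])
        message.set("text_features", features)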

@winstarwang
Author

Hi @winstarwang, we're actually playing around with this ourselves at the moment; thanks for the suggestion. There's a branch called bert_embeddings which you can experiment with if you like. One of our major concerns so far is that prediction with BERT takes quite a long time.

Thanks very much! I will try it later.

@prograwertyk

Hi @akelad, do you also plan to play around with LASER from Facebook? The table in the linked article suggests that LASER outperforms BERT at zero-shot transfer.

@blakeandrewwood
Contributor

@winstarwang We've added a BERT component to our component suite if you want to check it out:
https://pypi.org/project/innatis/

@winstarwang
Author

@blakeandrewwood Thanks very much, I will try it later.

@akelad
Contributor

akelad commented Mar 6, 2019

Awesome work @blakeandrewwood!
Hey @prograwertyk, we've looked into it briefly but haven't tried it yet; definitely worth doing though, thanks for the suggestion :)

@winstarwang
Author

@blakeandrewwood Can you remove the dependency on "bert_tfhub_module_handle"? I tried it and it failed because the module is inaccessible from here.

@ShailChoksi
Contributor

Any update on this? I looked at the bert_embeddings branch, and it seems like some of the BERT code has changed, or the code in that branch references functions that aren't in BERT's code.

@akelad
Contributor

akelad commented Apr 15, 2019

Yeah, it's based on my/Rasa's fork of the repo. Honestly, right now we're not convinced we should add it to the repository, given that inference time is very long and the performance improvement isn't huge. But we're still running a few more experiments and will let you know.

@ShailChoksi
Contributor

ShailChoksi commented Apr 15, 2019

Thanks, I will take a look at the fork!
Can you give a specific number for how much slowdown we would see if BERT were implemented?

@ShailChoksi
Contributor

I would also suggest looking at this repo: https://github.com/hanxiao/bert-as-service, as there is a significant speedup according to this thread: google-research/bert#18

@akelad
Contributor

akelad commented Apr 16, 2019

Yeah, we looked into that repo. It seems to require an enormous amount of computing power, though, so for us it was simpler to just use the existing code.

@ShailChoksi
Contributor

For people wondering about the inference-time numbers, I ran a few tests on one of our models:

Non-BERT model: ~0.04 seconds per request
Model with BERT features: ~13 seconds per request

@winstarwang
Author

I have made a clone of rasa-nlu-0.14.4 and added a BERT feature. If anyone is interested, see https://github.com/winstarwang/rasa_nlu_bert

@winstarwang
Author

For people wondering about the inference-time numbers, I ran a few tests on one of our models:

Non-BERT model: ~0.04 seconds per request
Model with BERT features: ~13 seconds per request

Do you use persist() and load()?

@ShailChoksi
Contributor

@winstarwang No, I didn't, which would explain the 13 seconds! I looked over your code; I will set it up, run a few tests, and get back with the results.
Thanks!
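
For context, persist() and load() are the rasa_nlu component hooks that let an expensive object (such as a loaded BERT graph) be saved once after training and restored once at server startup, instead of being rebuilt on every request. A minimal, hypothetical sketch against the 0.14-era API (all names here are illustrative):

import os
import pickle
from rasa_nlu.components import Component

class CachedModelSketch(Component):
    name = "cached_model_sketch"

    def __init__(self, component_config=None, model=None):
        super(CachedModelSketch, self).__init__(component_config)
        self.model = model  # the expensive object, e.g. a BERT session

    def persist(self, model_dir):
        # Called once after training: write the model to disk.
        with open(os.path.join(model_dir, "cached_model.pkl"), "wb") as f:
            pickle.dump(self.model, f)
        return {"model_file": "cached_model.pkl"}

    @classmethod
    def load(cls, model_dir=None, model_metadata=None, cached_component=None, **kwargs):
        # Called once at startup: reuse a cached instance if available,
        # otherwise restore from disk rather than rebuilding.
        if cached_component:
            return cached_component
        with open(os.path.join(model_dir, "cached_model.pkl"), "rb") as f:
            return cls(model=pickle.load(f))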

@sibbsnb
Contributor

sibbsnb commented Apr 19, 2019

For people wondering about the inference-time numbers, I ran a few tests on one of our models:

Non-BERT model: ~0.04 seconds per request
Model with BERT features: ~13 seconds per request

Which environment did you run this in? I have seen standalone BERT executing in around 0.5 seconds on CPU.

@anubhavpnp

anubhavpnp commented May 5, 2019

I have made a clone of rasa-nlu-0.14.4 and added a BERT feature. If anyone is interested, see https://github.com/winstarwang/rasa_nlu_bert

Hi @winstarwang, I looked at your repo and set it up. Can you please advise how I should go about training the BERT module in your repo? I just trained a Google BERT model using this guide, https://blog.insightdatascience.com/using-bert-for-state-of-the-art-pre-training-for-natural-language-processing-1d87142c29e7, and it worked fine. But how should I train it when it is integrated with Rasa? Can I provide the model directory path (already trained using Google BERT) somewhere? Please advise.

@anubhavpnp

anubhavpnp commented May 5, 2019

Yeah, it's based on my/Rasa's fork of the repo. Honestly, right now we're not convinced we should add it to the repository, given that inference time is very long and the performance improvement isn't huge. But we're still running a few more experiments and will let you know.

Hi @akelad, I just trained a model using Google BERT and another model using Rasa, with exactly the same training dataset for both. I can definitely tell you that Google BERT is working like a charm and predicting intents perfectly, whereas Rasa is going haywire on many of the sentences I give it for intent prediction (I used the tensorflow embedding pipeline in Rasa). I am in the process of integrating BERT with Rasa, though, and will then repeat the tests.

@akelad
Contributor

akelad commented May 6, 2019

@anubhavpnp what kind of dataset is this?

@sibbsnb
Contributor

sibbsnb commented May 7, 2019

Yeah, it's based on my/Rasa's fork of the repo. Honestly, right now we're not convinced we should add it to the repository, given that inference time is very long and the performance improvement isn't huge. But we're still running a few more experiments and will let you know.

Hi @akelad, I just trained a model using Google BERT and another model using Rasa, with exactly the same training dataset for both. I can definitely tell you that Google BERT is working like a charm and predicting intents perfectly, whereas Rasa is going haywire on many of the sentences I give it for intent prediction (I used the tensorflow embedding pipeline in Rasa). I am in the process of integrating BERT with Rasa, though, and will then repeat the tests.

BERT is way better than other models when I have tried it standalone.

@anubhavpnp

@anubhavpnp what kind of dataset is this?

@akelad It's a dataset containing about 14,500 sentences divided into 7 classes, i.e. about 2.2k sentences per class. I used the same dataset in both Rasa and standalone Google BERT (in the format each expects) and then gave sample sentences to predict the classes. The phrasing of the test sentences was very different from the training data.

@anubhavpnp

I have made a clone of rasa-nlu-0.14.4 and added a BERT feature. If anyone is interested, see https://github.com/winstarwang/rasa_nlu_bert

Hi @winstarwang, I have used your branch, but I get the error below when I try to train Rasa NLU:
2019-05-09 16:53:10 INFO rasa_nlu.model - Starting to train component tokenizer_bert
2019-05-09 16:53:13 INFO rasa_nlu.model - Finished training component.
2019-05-09 16:53:13 INFO rasa_nlu.model - Starting to train component ner_crf
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/site-packages/rasa_nlu/train.py", line 184, in <module>
    num_threads=cmdline_args.num_threads)
  File "/usr/local/lib/python3.6/site-packages/rasa_nlu/train.py", line 154, in do_train
    interpreter = trainer.train(training_data, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/rasa_nlu/model.py", line 196, in train
    **context)
  File "/usr/local/lib/python3.6/site-packages/rasa_nlu/extractors/crf_entity_extractor.py", line 140, in train
    dataset = self._create_dataset(filtered_entity_examples)
  File "/usr/local/lib/python3.6/site-packages/rasa_nlu/extractors/crf_entity_extractor.py", line 149, in _create_dataset
    dataset.append(self._from_json_to_crf(example, entity_offsets))
  File "/usr/local/lib/python3.6/site-packages/rasa_nlu/extractors/crf_entity_extractor.py", line 453, in _from_json_to_crf
    ents = self._bilou_tags_from_offsets(tokens, entity_offsets)
  File "/usr/local/lib/python3.6/site-packages/rasa_nlu/extractors/crf_entity_extractor.py", line 473, in _bilou_tags_from_offsets
    starts = {token.offset: i for i, token in enumerate(tokens)}
  File "/usr/local/lib/python3.6/site-packages/rasa_nlu/extractors/crf_entity_extractor.py", line 473, in <dictcomp>
    starts = {token.offset: i for i, token in enumerate(tokens)}
AttributeError: 'str' object has no attribute 'offset'
Makefile:24: recipe for target 'train-nlu' failed
make: *** [train-nlu] Error 1

Below is the content of nlu_config.yml:

language: "en"

pipeline:
- name: "tokenizer_bert"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_bert"
- name: "intent_classifier_tensorflow_bert"

I am using the Rasa starter NLU project at https://github.com/RasaHQ/starter-pack-rasa-nlu. I built Rasa NLU from your branch, installed it locally, and then used the project at the above link. Can you please help? If I replace tokenizer_bert with the standard whitespace tokenizer, it works.

@anubhavpnp

I have made a clone of rasa-nlu-0.14.4 and added a BERT feature. If anyone is interested, see https://github.com/winstarwang/rasa_nlu_bert

Hi @winstarwang, I notice that you have not handled offsets in the bert_tokenizer, which is why the error occurs. Any idea how to handle that?
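
For reference, the crash above happens because ner_crf expects each token to carry a character offset into the original text. A minimal sketch of how a BERT-style tokenizer could attach offsets, assuming the rasa_nlu 0.14 Token class (which takes a text and an offset); the WordPiece handling here is illustrative, not winstarwang's actual code:

from rasa_nlu.tokenizers import Token

def tokens_with_offsets(text, pieces):
    # Map each WordPiece back to its character offset in the original
    # text so downstream components like ner_crf can align entities.
    tokens, search_from = [], 0
    for piece in pieces:
        surface = piece[2:] if piece.startswith("##") else piece  # strip the subword marker
        offset = text.lower().find(surface.lower(), search_from)
        if offset == -1:  # piece has no verbatim surface form (e.g. [UNK]); skip it
            continue
        tokens.append(Token(surface, offset))
        search_from = offset + len(surface)
    return tokens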

@GaoQ1
Contributor

GaoQ1 commented Jun 6, 2019

@anubhavpnp I already added it: https://github.com/GaoQ1/rasa_nlu_gq/blob/master/rasa_nlu_gao/featurizers/bert_vectors_featurizer.py

@anubhavpnp

Thanks a lot @GaoQ1! I will try it.

@DavidForlen

@akelad Is your branch still usable? I have the same issue as #3674: the import of create_features from bert.extract_features doesn't work, as it can't be found. The only package to install is bert-tensorflow, isn't it?

@JulianGerhard21
Contributor

Hi @GaoQ1,

I have seen your implementation of an EmbeddingBERTClassifier. Am I able to use it with Rasa version >= 1.0? What if I want to use a pretrained BERT model?

Any hint as to which version currently works would be appreciated, since I am going to start some experiments in the direction of domain-specific fine-tuning on a German BERT.

Regards

@anubhavpnp

I have made a clone of rasa-nlu-0.14.4 and added a BERT feature. If anyone is interested, see https://github.com/winstarwang/rasa_nlu_bert

Hi @winstarwang, your repo was quite useful. Do you have any plans to make it compatible with Rasa X? Your code was working for me with old Rasa versions, but not with Rasa X anymore. Please do make the modification; that would be really great.

@anubhavpnp

@anubhavpnp I already added it: https://github.com/GaoQ1/rasa_nlu_gq/blob/master/rasa_nlu_gao/featurizers/bert_vectors_featurizer.py

Hi @GaoQ1, a couple of questions:

  1. I only need the entity extraction feature, as my intent classification is working fine after integrating with BERT. As I see it, https://github.com/GaoQ1/rasa_nlu_gq/blob/master/rasa_nlu_gao/extractors/bilstm_crf_entity_extractor.py is the file in your repo that does the entity extraction.
  2. Will the above file require whitespace-delimited tokens? In that case, can I use the whitespace tokenizer from the Rasa repo? The reason I ask is that BERT expects entity tagging in BILOU format (a conversion sketch follows below) and does not accept the format in which we tag entities in Rasa training files. Does your entity extractor handle that conversion (Rasa format to BERT format)?

Please do help, as I badly need entity extraction with BERT. All the git repos I have explored expect data in BILOU format, which is very different from what Rasa expects.
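
To illustrate the conversion being asked about: a minimal, hypothetical sketch (not from any of the repos discussed) of turning Rasa-style character-offset entity annotations into BILOU tags over whitespace tokens:

def bilou_tags(text, entities):
    # entities: dicts with "start", "end", "entity" keys, as in Rasa
    # training data. Returns one (token, BILOU tag) pair per whitespace token.
    tokens, offsets, pos = [], [], 0
    for tok in text.split():
        start = text.index(tok, pos)
        tokens.append(tok)
        offsets.append((start, start + len(tok)))
        pos = start + len(tok)
    tags = ["O"] * len(tokens)
    for ent in entities:
        covered = [i for i, (s, e) in enumerate(offsets)
                   if s >= ent["start"] and e <= ent["end"]]
        if len(covered) == 1:
            tags[covered[0]] = "U-" + ent["entity"]
        elif covered:
            tags[covered[0]] = "B-" + ent["entity"]
            for i in covered[1:-1]:
                tags[i] = "I-" + ent["entity"]
            tags[covered[-1]] = "L-" + ent["entity"]
    return list(zip(tokens, tags))

For example, bilou_tags("book a flight to new york city", [{"start": 17, "end": 30, "entity": "city"}]) tags "new", "york", "city" as B-city, I-city, L-city and everything else as O.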

@shayanghosh49

Hi,
My purpose is to run an intent classification exercise on a set of user responses using Rasa NLU. I had earlier used pretrained_embeddings_spacy but now want to try BERT embeddings, given their obvious advantages. After installing the requirements, I used the configuration described here:
https://github.com/Revmaker/innatis
I encounter the following errors when I try to train with the 'rasa train nlu' command:

  1. ImportError: cannot import name 'create_argument_parser'
  2. Exception: Failed to find component class for 'innatis.classifiers.BertIntentClassifier'. Unknown component name. Check your configured pipeline and make sure the mentioned component is not misspelled. If you are creating your own component, make sure it is either listed as part of the component_classes in rasa.nlu.registry.py or is a proper name of a class in a module.

How do I get rid of these errors and exceptions?
Is innatis.classifiers.BertIntentClassifier a valid component that can be used to train a Rasa model now?

@anubhavpnp

Hi,
My purpose is to run an intent classification exercise on a set of user responses using Rasa NLU. I had earlier used pretrained_embeddings_spacy but now want to try BERT embeddings, given their obvious advantages. After installing the requirements, I used the configuration described here:
https://github.com/Revmaker/innatis
I encounter the following errors when I try to train with the 'rasa train nlu' command:

  1. ImportError: cannot import name 'create_argument_parser'
  2. Exception: Failed to find component class for 'innatis.classifiers.BertIntentClassifier'. Unknown component name. Check your configured pipeline and make sure the mentioned component is not misspelled. If you are creating your own component, make sure it is either listed as part of the component_classes in rasa.nlu.registry.py or is a proper name of a class in a module.

How do I get rid of these errors and exceptions?
Is innatis.classifiers.BertIntentClassifier a valid component that can be used to train a Rasa model now?

I have used this repo and it works fine:
https://github.com/winstarwang/rasa_nlu_bert

@winstarwang
Author

winstarwang commented Dec 2, 2019

I have made a clone of rasa-nlu-0.14.4 and added a BERT feature. If anyone is interested, see https://github.com/winstarwang/rasa_nlu_bert

Hi @winstarwang, I notice that you have not handled offsets in the bert_tokenizer, which is why the error occurs. Any idea how to handle that?

I am so sorry; I didn't get back to you in time because I was working on other things.
Here is my bert_rasa config file content; hope it helps:

language: "zh"
pipeline:
- name: "tokenizer_bert"
  vocab_file: "data/chinese_L-12_H-768_A-12/vocab.txt"
  do_lower_case: True
  max_seq_length: 64
- name: "intent_featurizer_bert"
- name: "intent_classifier_tensorflow_bert"
  # bert config
  # The config json file corresponding to the pre-trained BERT model.
  # This specifies the model architecture.
  "bert_config_file": data/chinese_L-12_H-768_A-12/bert_config.json
  # The vocabulary file that the BERT model was trained on.
  "vocab_file": data/chinese_L-12_H-768_A-12/vocab.txt
  # The output directory where the model checkpoints will be written.
  "output_dir": data/chinese_L-12_H-768_A-12/output
  # Initial checkpoint (usually from a pre-trained BERT model).
  "init_checkpoint": data/chinese_L-12_H-768_A-12/bert_model.ckpt
  # Whether to lower case the input text. Should be True for uncased
  # models and False for cased models.
  "do_lower_case": True

  # training parameters
  # Total batch size for training.
  "train_batch_size": 32
  # Total batch size for eval.
  "eval_batch_size": 8
  # Total batch size for predict.
  "predict_batch_size": 8
  # The initial learning rate for Adam.
  "learning_rate": 0.0005
  # The maximum total input sequence length after WordPiece tokenization.
  # Sequences longer than this will be truncated, and sequences shorter
  # than this will be padded.
  "max_seq_length": 64
  # The number of labels
  "num_labels": 990
  # number of epochs
  "epochs": 10

  # Proportion of training to perform linear learning rate warmup for.
  # E.g., 0.1 = 10% of training.
  "warmup_proportion": 0.1
  # How often to save the model checkpoint.
  "save_checkpoints_steps": 1000
  # How many steps to make in each estimator call.
  "iterations_per_loop": 1000

@winstarwang
Author

I have made a clone of rasa-nlu-0.14.4 and added a BERT feature. If anyone is interested, see https://github.com/winstarwang/rasa_nlu_bert

Hi @winstarwang, your repo was quite useful. Do you have any plans to make it compatible with Rasa X? Your code was working for me with old Rasa versions, but not with Rasa X anymore. Please do make the modification; that would be really great.

I will try to merge the BERT code into the newest Rasa version soon.

@ShailChoksi
Contributor

@winstarwang I really liked your implementation, and I have implemented it. I also stumbled upon https://github.com/JulianGerhard21/bert_spacy_rasa, which simplifies a lot of the configuration you have to do. You should check it out and see whether your solution or the one in bert_spacy_rasa is worth adopting.

@stale

stale bot commented Mar 1, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the status:stale label Mar 1, 2020
@tabergma
Contributor

tabergma commented Mar 2, 2020

We added support for BERT and other well-known language models in our latest release, 1.8.0.

In order to use the pre-trained word embeddings from BERT, you need to include the following components in your pipeline:

pipeline:
  - name: HFTransformersNLP
    # Name of the language model to use
    model_name: "bert"
  - name: "LanguageModelTokenizer"
  - name: "LanguageModelFeaturizer"

You can read more about those components on our docs (https://rasa.com/docs/rasa/nlu/components/).
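
As a usage note, the components above only tokenize and featurize; for intent classification they would typically be followed by a classifier. A sketch of a fuller Rasa 1.8 pipeline (the DIETClassifier entry and its epochs value are illustrative, not part of the comment above):

pipeline:
  - name: HFTransformersNLP
    model_name: "bert"
  - name: "LanguageModelTokenizer"
  - name: "LanguageModelFeaturizer"
  - name: "DIETClassifier"
    epochs: 100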

@stale stale bot removed the status:stale label Mar 2, 2020
@akelad akelad closed this as completed Mar 2, 2020