Better handling of sequences longer than maximum sequence length for BERT style models #6567

Merged: 11 commits, Sep 4, 2020


dakshvar22

Proposed changes:

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

@github-actions github-actions bot deleted a comment from dakshvar22 Sep 3, 2020
@dakshvar22 dakshvar22 changed the title Better handling sequences longer than maximum sequence length for BERT style models Better handling of sequences longer than maximum sequence length for BERT style models Sep 3, 2020
@github-actions

github-actions bot commented Sep 3, 2020

Commit: bf9cdb1. The full report is available as an artifact.

Dataset: Carbon Bot

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 1m3s, train: 2m50s, total: 3m52s) | 0.7709 | 0.6260 | 0.5116 |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 1m7s, train: 3m34s, total: 4m41s) | 0.7806 | 0.8143 | 0.4983 |

Dataset: Hermit

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 2m29s, train: 19m32s, total: 22m1s) | 0.8383 | 0.7487 | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 2m11s, train: 12m34s, total: 14m44s) | 0.8903 | 0.7874 | no data |

Dataset: Private 1

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 1m35s, train: 4m0s, total: 5m34s) | 0.8150 | 0.9612 | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 1m50s, train: 3m19s, total: 5m8s) | 0.8763 | 0.9726 | no data |

Dataset: Private 2

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 1m32s, train: 3m45s, total: 5m17s) | 0.7106 | no data | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 1m39s, train: 7m15s, total: 8m53s) | 0.8339 | no data | no data |

Dataset: Private 3

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 37s, train: 58s, total: 1m35s) | 0.4733 | no data | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 42s, train: 51s, total: 1m32s) | 0.6461 | no data | no data |

Dataset: Sara

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 1m46s, train: 5m48s, total: 7m34s) | 0.7767 | 0.8683 | 0.7870 |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 2m1s, train: 4m12s, total: 6m12s) | 0.8423 | 0.8861 | 0.7913 |


@tabergma tabergma left a comment


Looks great 🚀 Added some comments.

rasa/nlu/utils/hugging_face/hf_transformers.py (6 resolved review threads, outdated)
tests/nlu/utils/test_hf_transformers.py (1 resolved review thread)

@tabergma tabergma left a comment


Thanks for addressing my comments 💯

If possible, it would be great to add a test that the inference mode is set to False during training and to True during prediction. But other than that, good to go 👍
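
A minimal sketch of the kind of test suggested here, assuming a hypothetical `ToyFeaturizer` whose `train` and `process` methods forward an `inference_mode` flag to an internal helper. None of these names come from the PR; the real class and method names in `hf_transformers.py` may differ. The helper is replaced with a mock so that the flag value it receives can be inspected.

```python
from unittest import mock


class ToyFeaturizer:
    """Hypothetical stand-in for a featurizer; not Rasa's actual class."""

    def _compute_features(self, texts, inference_mode):
        # Placeholder feature computation; the flag is what the test inspects.
        return [len(text) for text in texts]

    def train(self, texts):
        # Training path: inference mode is switched off.
        return self._compute_features(texts, inference_mode=False)

    def process(self, text):
        # Prediction path: inference mode is switched on.
        return self._compute_features([text], inference_mode=True)[0]


def recorded_inference_modes(action):
    """Run `action` on a featurizer and return the flag values the helper saw."""
    featurizer = ToyFeaturizer()
    with mock.patch.object(
        ToyFeaturizer, "_compute_features", return_value=[0]
    ) as spy:
        action(featurizer)
    return [call.kwargs["inference_mode"] for call in spy.call_args_list]


assert recorded_inference_modes(lambda f: f.train(["hi", "there"])) == [False]
assert recorded_inference_modes(lambda f: f.process("hi")) == [True]
```

Patching the helper rather than the public methods keeps the test focused on the one thing the review asks for: which flag value each entry point passes along.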

@dakshvar22

@tabergma Thanks for the review, I was adding a test for it. Learnt how to test logging statements 😄
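
For readers who have not tested logging statements before, here is a rough sketch using only the standard library. The helper `fit_to_max_length`, the logger name, and the behaviour shown (raise during training, truncate and log during prediction) are assumptions for illustration and are not taken from the PR's diff. The PR's own tests may well use a different mechanism, for example pytest's `caplog` fixture.

```python
import logging
import unittest

logger = logging.getLogger("toy_featurizer")  # hypothetical logger name

MAX_SEQUENCE_LENGTH = 512  # illustrative default; original BERT models accept 512 positions


def fit_to_max_length(token_ids, inference_mode, max_length=MAX_SEQUENCE_LENGTH):
    """Assumed policy: fail loudly in training, truncate and log in prediction."""
    if len(token_ids) <= max_length:
        return token_ids
    if not inference_mode:
        raise RuntimeError(
            f"Sequence of length {len(token_ids)} exceeds the maximum of {max_length}."
        )
    logger.debug(
        "Truncating sequence of length %d to %d tokens.", len(token_ids), max_length
    )
    return token_ids[:max_length]


def run_and_capture(token_ids, max_length):
    """Call the helper in inference mode and return (result, captured log lines)."""
    case = unittest.TestCase()
    # assertLogs fails if nothing is logged at DEBUG level or above.
    with case.assertLogs(logger, level="DEBUG") as captured:
        result = fit_to_max_length(token_ids, inference_mode=True, max_length=max_length)
    return result, captured.output


result, messages = run_and_capture(list(range(10)), max_length=4)
assert result == [0, 1, 2, 3]
assert len(messages) == 1 and "Truncating" in messages[0]
```

`assertLogs` doubles as the assertion that a message was emitted at all, so the test fails if the truncation path silently stops logging.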

@github-actions

github-actions bot commented Sep 4, 2020

Hey @dakshvar22! 👋 To run model regression tests, comment with the /modeltest command and a configuration.

Tips 💡: The model regression tests run on push events. You can re-run them by re-adding the status:model-regression-tests label or by using the Re-run jobs button in the GitHub Actions workflow.

Tips 💡: Whenever you want to change a configuration, edit the comment that contains the previous configuration.

You can copy this in your comment and customize:

/modeltest

```yml
##########
## Available datasets
##########
# - "Carbon Bot"
# - "Hermit"
# - "Private 1"
# - "Private 2"
# - "Private 3"
# - "Sara"

##########
## Available configurations
##########
# - "BERT + DIET(bow) + ResponseSelector(bow)"
# - "BERT + DIET(seq) + ResponseSelector(t2t)"
# - "ConveRT + DIET(bow) + ResponseSelector(bow)"
# - "ConveRT + DIET(seq) + ResponseSelector(t2t)"
# - "Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Spacy + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + ConveRT + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + ConveRT + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"

## Example configuration
#################### syntax #################
## include:
##   - dataset: ["<dataset_name>"]
##     config: ["<configuration_name>"]
#
## Example:
## include:
##  - dataset: ["Carbon Bot"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Shortcut:
## You can use the "all" shortcut to include all available configurations or datasets
#
## Example: Use the "Sparse + DIET(bow) + ResponseSelector(bow)" configuration
## for all available datasets
## include:
##  - dataset: ["all"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Example: Use all available configurations for the "Carbon Bot" and "Sara" datasets
## and for the "Hermit" dataset use the "Sparse + DIET(seq) + ResponseSelector(t2t)" and
## "Sparse + ConveRT + DIET(seq) + ResponseSelector(t2t)" configurations:
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##  - dataset: ["Hermit"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "Sparse + ConveRT + DIET(seq) + ResponseSelector(t2t)"]

include:
 - dataset: ["Carbon Bot"]
   config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]

```

@github-actions

github-actions bot commented Sep 4, 2020

/modeltest

include:
 - dataset: ["all"]
   config: ["BERT + DIET(bow) + ResponseSelector(bow)", "BERT + DIET(seq) + ResponseSelector(t2t)"]
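
To make the `all` shortcut concrete, the sketch below expands an `include` list into individual (dataset, configuration) runs. The function `expand_include` is hypothetical and only mirrors the behaviour described in the template comment; the bot's actual expansion logic is not shown in this PR. The dataset and configuration names are copied from the template.

```python
from itertools import product

AVAILABLE_DATASETS = [
    "Carbon Bot", "Hermit", "Private 1", "Private 2", "Private 3", "Sara",
]
AVAILABLE_CONFIGS = [
    "BERT + DIET(bow) + ResponseSelector(bow)",
    "BERT + DIET(seq) + ResponseSelector(t2t)",
    "ConveRT + DIET(bow) + ResponseSelector(bow)",
    "ConveRT + DIET(seq) + ResponseSelector(t2t)",
    "Spacy + DIET(bow) + ResponseSelector(bow)",
    "Spacy + DIET(seq) + ResponseSelector(t2t)",
    "Sparse + ConveRT + DIET(bow) + ResponseSelector(bow)",
    "Sparse + ConveRT + DIET(seq) + ResponseSelector(t2t)",
    "Sparse + DIET(bow) + ResponseSelector(bow)",
    "Sparse + DIET(seq) + ResponseSelector(t2t)",
    "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)",
    "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)",
]


def expand_include(include):
    """Hypothetical expansion of `include` entries into (dataset, config) pairs."""
    runs = []
    for entry in include:
        # ["all"] stands for every available dataset or configuration.
        datasets = AVAILABLE_DATASETS if entry["dataset"] == ["all"] else entry["dataset"]
        configs = AVAILABLE_CONFIGS if entry["config"] == ["all"] else entry["config"]
        runs.extend(product(datasets, configs))
    return runs


# The configuration from the comment above, written as plain Python data.
include = [
    {
        "dataset": ["all"],
        "config": [
            "BERT + DIET(bow) + ResponseSelector(bow)",
            "BERT + DIET(seq) + ResponseSelector(t2t)",
        ],
    }
]
runs = expand_include(include)
assert len(runs) == 12
```

Under this reading the request covers 6 datasets times 2 configurations, which is consistent with the twelve result rows in the report posted for commit deb3c76.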

@github-actions

github-actions bot commented Sep 4, 2020

The model regression tests have started. They might take a while, so please be patient.
As soon as the results are ready, you'll see a new comment with them.

The configuration used can be found in the preceding comment.

@github-actions

github-actions bot commented Sep 4, 2020

Commit: deb3c76. The full report is available as an artifact.

Dataset: Carbon Bot

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 1m1s, train: 3m13s, total: 4m14s) | 0.7709 | 0.6260 | 0.5116 |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 1m10s, train: 3m39s, total: 4m49s) | 0.7806 | 0.8143 | 0.4983 |

Dataset: Hermit

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 2m30s, train: 19m36s, total: 22m6s) | 0.8374 | 0.7487 | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 2m10s, train: 12m36s, total: 14m46s) | 0.8903 | 0.7874 | no data |

Dataset: Private 1

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 1m33s, train: 3m52s, total: 5m25s) | 0.8150 | 0.9612 | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 1m52s, train: 3m29s, total: 5m21s) | 0.8763 | 0.9726 | no data |

Dataset: Private 2

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 1m36s, train: 3m54s, total: 5m29s) | 0.7106 | no data | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 1m39s, train: 7m23s, total: 9m2s) | 0.8339 | no data | no data |

Dataset: Private 3

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 37s, train: 59s, total: 1m36s) | 0.4733 | no data | no data |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 40s, train: 49s, total: 1m29s) | 0.6461 | no data | no data |

Dataset: Sara

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| BERT + DIET(bow) + ResponseSelector(bow) (test: 1m50s, train: 5m52s, total: 7m42s) | 0.7767 | 0.8683 | 0.7913 |
| BERT + DIET(seq) + ResponseSelector(t2t) (test: 2m1s, train: 4m13s, total: 6m13s) | 0.8423 | 0.8861 | 0.7891 |

@rasabot rasabot merged commit a360b68 into master Sep 4, 2020
@rasabot rasabot deleted the fix_hf_large_seq branch September 4, 2020 19:56