Skip to content
This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[reorg] Move NER script out of bert folder #1090

Merged
merged 15 commits into from Jan 16, 2020

Conversation

eric-haibin-lin
Copy link
Member

Description

#1081
Add a "sequence labeling" section in model zoo, with a BERT NER example.

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@eric-haibin-lin eric-haibin-lin requested a review from a team as a code owner January 4, 2020 07:15
@codecov
Copy link

codecov bot commented Jan 4, 2020

Codecov Report

Merging #1090 into master will increase coverage by 9.85%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1090      +/-   ##
==========================================
+ Coverage   78.51%   88.37%   +9.85%     
==========================================
  Files          67       66       -1     
  Lines        6266     6290      +24     
==========================================
+ Hits         4920     5559     +639     
+ Misses       1346      731     -615
Impacted Files Coverage Δ
src/gluonnlp/model/block.py 53.19% <100%> (ø) ⬆️
src/gluonnlp/optimizer/bert_adam.py 87.32% <0%> (-5.86%) ⬇️
src/gluonnlp/utils/files.py 45.9% <0%> (-3.12%) ⬇️
src/gluonnlp/optimizer/__init__.py 100% <0%> (ø) ⬆️
src/gluonnlp/utils/version.py 100% <0%> (ø) ⬆️
src/gluonnlp/optimizer/lamb.py
src/gluonnlp/model/sampled_block.py 90.74% <0%> (+0.08%) ⬆️
src/gluonnlp/data/glue.py 98.63% <0%> (+1.81%) ⬆️
src/gluonnlp/model/train/embedding.py 87.17% <0%> (+2.56%) ⬆️
src/gluonnlp/embedding/token_embedding.py 91.79% <0%> (+2.82%) ⬆️
... and 19 more

@mli
Copy link
Member

mli commented Jan 6, 2020

Job PR-1090/3 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/3/index.html

Copy link
Contributor

@leezu leezu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a test running the finetune_ner / predict_ner scripts be added?

@eric-haibin-lin
Copy link
Member Author

The problem is that we cannot distribute the dataset, so no test was added when it was first merged. But i did test manually with the CoNLL dataset for 1 epoch and it can run successfully.

@leezu
Copy link
Contributor

leezu commented Jan 6, 2020

How about using a small Toy dataset of equivalent format?

@leezu
Copy link
Contributor

leezu commented Jan 6, 2020

My concern is that it's easy to break the script if it's not run at all.

@mli
Copy link
Member

mli commented Jan 15, 2020

Job PR-1090/4 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/4/index.html

@mli
Copy link
Member

mli commented Jan 15, 2020

Job PR-1090/5 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/5/index.html

@leezu leezu added the release focus Progress focus for release label Jan 15, 2020
@mli
Copy link
Member

mli commented Jan 15, 2020

Job PR-1090/6 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/6/index.html

@mli
Copy link
Member

mli commented Jan 15, 2020

Job PR-1090/7 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/7/index.html

@mli
Copy link
Member

mli commented Jan 15, 2020

Job PR-1090/8 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/8/index.html

@mli
Copy link
Member

mli commented Jan 15, 2020

Job PR-1090/9 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/9/index.html

@mli
Copy link
Member

mli commented Jan 16, 2020

Job PR-1090/10 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/10/index.html

Copy link
Contributor

@leezu leezu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but you may still want to add a link in docs/model_zoo.rst so that Sequence Labeling task is also highlighted on the front page (and not only as a small item in the left menu)

@mli
Copy link
Member

mli commented Jan 16, 2020

Job PR-1090/11 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-1090/11/index.html

@eric-haibin-lin eric-haibin-lin merged commit 9150527 into dmlc:master Jan 16, 2020
@eric-haibin-lin
Copy link
Member Author

@leezu added to model_zoo.rst

@eric-haibin-lin eric-haibin-lin deleted the seq-label branch February 2, 2020 06:21
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
release focus Progress focus for release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants