add bert base class and restructure encoding extractor #383

Merged: 92 commits, Apr 27, 2020
Changes from 90 commits
Commits
f36b2b5
add base class and restructure encoding extractor
rbroc Feb 27, 2020
1ddc26b
fix structure and add LM extractor
rbroc Mar 2, 2020
05e7670
add one more annotation
rbroc Mar 2, 2020
a89ed65
add
rbroc Mar 2, 2020
20eef0a
fix prediction shape
rbroc Mar 2, 2020
a0fbb8f
start implementing target routine + other fixes
rbroc Mar 3, 2020
27f996a
add softmax as option
rbroc Mar 3, 2020
ac01e15
only allow one mask
rbroc Mar 3, 2020
597835c
add threshold option, refine target tokens option (both mutually excl…
rbroc Mar 3, 2020
60b91f0
edit docstring
rbroc Mar 3, 2020
b4fbb45
allow keep info on true word
rbroc Mar 6, 2020
c66dfb0
move mask specification to extract
rbroc Mar 6, 2020
a9d24f3
fix mask-based indexing in mask method
rbroc Mar 9, 2020
68ddd94
refine logic
rbroc Mar 10, 2020
0f2918f
checkpoint
rbroc Mar 10, 2020
dbf4d20
restore mask in init
rbroc Mar 10, 2020
527c6d4
fix to_df and indexing
rbroc Mar 11, 2020
c8b1c36
notes
rbroc Mar 11, 2020
445ef09
checkpoint
rbroc Mar 18, 2020
92ff1eb
Update pliers/extractors/text.py
rbroc Mar 18, 2020
6089fc6
Update pliers/extractors/text.py
rbroc Mar 18, 2020
c9dc0ea
Merge branch 'bertLM' of https://github.com/rbroc/pliers into bertLM
rbroc Mar 18, 2020
c6020a9
_model_attributes as class attribute
rbroc Mar 18, 2020
9dc891d
check pooling arg before superclass initializer
rbroc Mar 18, 2020
5629a0b
move superclass init after argument validation
rbroc Mar 18, 2020
d74ace4
add docstring to additional methods
rbroc Mar 18, 2020
8dc3202
add self.mask in __init__
rbroc Mar 18, 2020
d4c8c87
fix docstrings
rbroc Mar 18, 2020
0aa2f93
rename return_metadata args
rbroc Mar 18, 2020
f606049
set class to AutoModel to enable any BERT-like model (ALBERT, RoBERTA…
rbroc Mar 18, 2020
6a4c8de
try BertBase as metaclass
rbroc Mar 19, 2020
98a5fa0
simplify children classes
rbroc Mar 19, 2020
835726a
add update_mask method and remove mask from nonLM extractors
rbroc Mar 19, 2020
031cbd1
add prototype sentiment extractor
rbroc Mar 19, 2020
e577149
restore class hierarchy
rbroc Mar 24, 2020
1b9c4ae
checkpoint
rbroc Mar 24, 2020
15e09e4
added full test suite for token-level encoding extractor
rbroc Mar 25, 2020
15a5446
added BertSequenceEncoding test suit
rbroc Mar 25, 2020
e16e743
fix pooling and return_special for sequence extractor; also fix retur…
rbroc Mar 26, 2020
ad62c99
add first bertLM tests
rbroc Mar 26, 2020
5275d70
added all tests for bertLM and sentiment extractor
rbroc Mar 27, 2020
b47bbbb
disable caching for LM test
rbroc Mar 27, 2020
5d99f16
Merge branch 'master' into bertLM
rbroc Mar 27, 2020
6bb970f
resolve conflict
rbroc Mar 27, 2020
846111c
fix ExtractorResult caching issue
rbroc Mar 30, 2020
bff8b24
remove set cache_transformer statement
rbroc Mar 30, 2020
ab92638
fix to_df format
rbroc Mar 30, 2020
c7bde95
fix spacing
rbroc Mar 30, 2020
9f267fb
try tests without tf
rbroc Mar 31, 2020
aa7d9d7
try splitting tests
rbroc Mar 31, 2020
20e520a
try deleting models
rbroc Mar 31, 2020
288858e
fix typo
rbroc Mar 31, 2020
6c12160
remove all after use
rbroc Mar 31, 2020
9c7041e
add debug statement
rbroc Mar 31, 2020
1e53e30
add more debug statements
rbroc Mar 31, 2020
7760d0a
skip all but word counter test
rbroc Mar 31, 2020
73737be
skip models only
rbroc Mar 31, 2020
f77f02c
skip sequence
rbroc Mar 31, 2020
c8ed064
restore all tests
rbroc Mar 31, 2020
3538853
try only base, sequence, sentiment
rbroc Mar 31, 2020
ff7c834
remove custom log_attribute logic from to_df, adapt tests and restore…
rbroc Mar 31, 2020
c1cd500
Merge branch 'master' into bertLM
rbroc Mar 31, 2020
fb7ef79
skip test several models
rbroc Apr 1, 2020
ae97371
only try one test
rbroc Apr 1, 2020
95e3cd3
no _log_attribute tests
rbroc Apr 1, 2020
4ee6dff
disable pytest caching
rbroc Apr 1, 2020
4b9e884
fix shape assertion
rbroc Apr 1, 2020
c98735e
skip test models
rbroc Apr 1, 2020
8950729
no storing of extractors
rbroc Apr 2, 2020
2543d83
no distilbert download
rbroc Apr 2, 2020
13bb469
disable timeout
rbroc Apr 2, 2020
ecd9d9a
remove all other models
rbroc Apr 2, 2020
23b121b
revert
rbroc Apr 3, 2020
736e661
test text ext only
rbroc Apr 3, 2020
1cc1756
no init target
rbroc Apr 3, 2020
f8b6d90
separate file
rbroc Apr 3, 2020
a58e296
try last test edit
rbroc Apr 3, 2020
e9c2edb
revert
rbroc Apr 3, 2020
706100b
do not clear models
rbroc Apr 3, 2020
f876fac
add markers
rbroc Apr 6, 2020
d2fa698
fix travis flag
rbroc Apr 6, 2020
561bd91
mark all as high mem
rbroc Apr 6, 2020
f4a2690
skipif in travis
rbroc Apr 6, 2020
1142d14
clear cache opt
rbroc Apr 6, 2020
e78e921
delete bert_test file
rbroc Apr 6, 2020
0e59c25
checkpoint
rbroc Apr 6, 2020
2ac62df
tf test
rbroc Apr 6, 2020
a8bda72
checkpoint
rbroc Apr 7, 2020
10e157f
skip one more test
rbroc Apr 7, 2020
4a077bf
only run encoding extractor
rbroc Apr 7, 2020
380950b
add docstring to update_mask
rbroc Apr 16, 2020
71d9b7c
remove abc import
rbroc Apr 16, 2020
4 changes: 2 additions & 2 deletions .travis.yml
@@ -48,8 +48,8 @@ before_script:
 - python -m pliers.support.download
 - python -m spacy download en_core_web_sm
 script:
-- py.test pliers/tests/test_* pliers/tests/converters pliers/tests/filters --cov=pliers --cov-report= -m "not requires_payment" -W ignore::UserWarning
-- py.test pliers/tests/extractors --cov=pliers --cov-report= -m "not requires_payment" --cov-append -W ignore::UserWarning
+- py.test pliers/tests/test_* pliers/tests/converters pliers/tests/filters --cov=pliers --cov-report= -m "not requires_payment" -W ignore::UserWarning
+- py.test pliers/tests/extractors --cov=pliers --cov-report= -m "not requires_payment" --cov-append -W ignore::UserWarning
 after_success:
 - coveralls
 before_cache:
9 changes: 7 additions & 2 deletions pliers/extractors/__init__.py
@@ -64,7 +64,9 @@
 NumUniqueWordsExtractor, PartOfSpeechExtractor,
 WordEmbeddingExtractor, TextVectorizerExtractor,
 VADERSentimentExtractor, SpaCyExtractor,
-WordCounterExtractor, PretrainedBertEncodingExtractor)
+WordCounterExtractor, BertExtractor,
+BertSequenceEncodingExtractor, BertLMExtractor,
+BertSentimentExtractor)
 from .video import (FarnebackOpticalFlowExtractor)

 __all__ = [
@@ -139,7 +141,10 @@
 'BeatTrackExtractor',
 'HarmonicExtractor',
 'PercussiveExtractor',
+'BertExtractor',
+'BertSequenceEncodingExtractor',
+'BertLMExtractor',
+'BertSentimentExtractor',
 'AudiosetLabelExtractor',
-'PretrainedBertEncodingExtractor',
 'WordCounterExtractor'
 ]
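
Because `pliers/extractors/__init__.py` keeps its import list and its `__all__` list in step by hand, a rename like this one can leave a stale entry behind. The following is a minimal sketch of a check for that, not part of this PR. The helper `missing_exports` and the stand-in module `fake_extractors` are hypothetical names introduced only for illustration, so the snippet runs without pliers installed.

```python
import types


def missing_exports(module):
    """Return names listed in module.__all__ that the module does not define."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Hypothetical stand-in for a package __init__ (NOT the real pliers module).
fake = types.ModuleType("fake_extractors")
new_names = (
    "BertExtractor",
    "BertSequenceEncodingExtractor",
    "BertLMExtractor",
    "BertSentimentExtractor",
)
for name in new_names:
    # Define an empty placeholder class under each exported name.
    setattr(fake, name, type(name, (), {}))

# Deliberately leave one stale name in __all__ to show what the check reports.
fake.__all__ = list(new_names) + ["PretrainedBertEncodingExtractor"]

print(missing_exports(fake))  # ['PretrainedBertEncodingExtractor']
```

Assuming pliers and its dependencies are installed, the same helper could be pointed at the real package with `import pliers.extractors as ex; missing_exports(ex)`. An empty list would mean every name in `__all__` resolves.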