[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer #10324

Merged
Changes from 22 commits
34 commits
afe2a24
push to show
patrickvonplaten Feb 22, 2021
f70b70e
small improvement
patrickvonplaten Feb 22, 2021
1b9152e
small improvement
patrickvonplaten Feb 22, 2021
d135f74
Update src/transformers/feature_extraction_utils.py
patrickvonplaten Feb 22, 2021
5246685
Update src/transformers/feature_extraction_utils.py
patrickvonplaten Feb 22, 2021
b6e3d68
implement base
patrickvonplaten Feb 23, 2021
b315373
add common tests
patrickvonplaten Feb 23, 2021
3302c28
make all tests pass for wav2vec2
patrickvonplaten Feb 23, 2021
8b883fe
make padding work & add more tests
patrickvonplaten Feb 24, 2021
93962ca
finalize feature extractor utils
patrickvonplaten Feb 24, 2021
55d2705
add call method to feature extraction
patrickvonplaten Feb 24, 2021
b496346
finalize feature processor
patrickvonplaten Feb 24, 2021
5239bf7
finish tokenizer
patrickvonplaten Feb 24, 2021
c17859e
finish general processor design
patrickvonplaten Feb 24, 2021
f64f25c
finish tests
patrickvonplaten Feb 24, 2021
08e3458
typo
patrickvonplaten Feb 24, 2021
ed9543a
remove bogus file
patrickvonplaten Feb 24, 2021
dca668a
finish docstring
patrickvonplaten Feb 24, 2021
4c7c013
add docs
patrickvonplaten Feb 24, 2021
80edb8b
finish docs
patrickvonplaten Feb 24, 2021
7189c24
small fix
patrickvonplaten Feb 24, 2021
6652130
correct docs
patrickvonplaten Feb 24, 2021
960c27c
save intermediate
patrickvonplaten Feb 25, 2021
d389a9e
load changes
patrickvonplaten Feb 25, 2021
900fee6
apply changes
patrickvonplaten Feb 25, 2021
7482eee
apply changes to doc
patrickvonplaten Feb 25, 2021
08b3ac6
change tests
patrickvonplaten Feb 25, 2021
e2ae501
apply surajs recommend
patrickvonplaten Feb 25, 2021
bfddc7f
final changes
patrickvonplaten Feb 25, 2021
bad66a8
Merge branch 'master' into speech_processor_design
patrickvonplaten Feb 25, 2021
d8ef58f
Apply suggestions from code review
patrickvonplaten Feb 25, 2021
6b5cc29
fix typo
patrickvonplaten Feb 25, 2021
d1aa8ea
fix import
patrickvonplaten Feb 25, 2021
791fbee
correct docstring
patrickvonplaten Feb 25, 2021
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -370,6 +370,7 @@ TensorFlow and/or Flax.
main_classes/processors
main_classes/tokenizer
main_classes/trainer
+main_classes/feature_extractor

.. toctree::
:maxdepth: 2
33 changes: 33 additions & 0 deletions docs/source/main_classes/feature_extractor.rst
@@ -0,0 +1,33 @@
..
Copyright 2021 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.


Feature Extractor
-----------------------------------------------------------------------------------------------------------------------

A feature extractor is in charge of preparing read-in audio files for a speech model. This includes feature extraction,
such as processing audio files to, *e.g.*, Log-Mel Spectrogram features, but also padding, normalization, and
conversion to Numpy, PyTorch, and TensorFlow tensors.


PreTrainedFeatureExtractor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.PreTrainedFeatureExtractor
:members: from_pretrained, save_pretrained, pad


BatchFeature
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BatchFeature
:members:
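
The description added in ``feature_extractor.rst`` above names padding, normalization, and tensor conversion as the jobs of a feature extractor. The following is a minimal pure-Python sketch of the first two steps, not the library's implementation: the helper names ``normalize`` and ``pad_batch`` are hypothetical, and zero-mean/unit-variance scaling with right-padding to the longest sequence is an assumption about one reasonable behaviour.

```python
from typing import List, Tuple


def normalize(waveform: List[float], eps: float = 1e-5) -> List[float]:
    """Scale one waveform to zero mean and approximately unit variance."""
    mean = sum(waveform) / len(waveform)
    var = sum((x - mean) ** 2 for x in waveform) / len(waveform)
    return [(x - mean) / (var + eps) ** 0.5 for x in waveform]


def pad_batch(
    batch: List[List[float]], padding_value: float = 0.0
) -> Tuple[List[List[float]], List[List[int]]]:
    """Right-pad every waveform to the longest one and build an attention mask."""
    max_len = max(len(w) for w in batch)
    padded, mask = [], []
    for w in batch:
        n_pad = max_len - len(w)
        padded.append(list(w) + [padding_value] * n_pad)
        # 1 marks real samples, 0 marks padding
        mask.append([1] * len(w) + [0] * n_pad)
    return padded, mask


# Two toy "waveforms" of different lengths (illustrative values only)
raw_batch = [[0.1, 0.2, 0.3, 0.4], [0.5, -0.5]]
input_values, attention_mask = pad_batch([normalize(w) for w in raw_batch])
```

Normalizing before padding keeps the padding values out of the mean and variance estimates; whether the library does it in this order is not stated in the diff.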
20 changes: 17 additions & 3 deletions docs/source/model_doc/wav2vec2.rst
@@ -34,7 +34,7 @@ Tips:

- Wav2Vec2 is a speech model that accepts a float array corresponding to the raw waveform of the speech signal.
- Wav2Vec2 model was trained using connectionist temporal classification (CTC) so the model output has to be decoded
-using :class:`~transformers.Wav2Vec2Tokenizer`.
+using :class:`~transformers.Wav2Vec2CTCTokenizer`.


Wav2Vec2Config
@@ -44,13 +44,27 @@ Wav2Vec2Config
:members:


-Wav2Vec2Tokenizer
+Wav2Vec2CTCTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-.. autoclass:: transformers.Wav2Vec2Tokenizer
+.. autoclass:: transformers.Wav2Vec2CTCTokenizer
:members: __call__, save_vocabulary


Wav2Vec2FeatureExtractor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.Wav2Vec2FeatureExtractor
:members: __call__


Wav2Vec2Processor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.Wav2Vec2Processor
:members: __call__, from_pretrained, save_pretrained, batch_decode, decode, as_target_processor


Wav2Vec2Model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

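
The tip in the ``wav2vec2.rst`` hunk above says that CTC-trained model output has to be decoded with the tokenizer. As a rough illustration of what greedy CTC decoding involves (collapse repeated ids, then drop the blank token), here is a self-contained sketch. The function ``ctc_greedy_decode``, the toy vocabulary, and the choice of ``<pad>`` as the blank and ``|`` as the word delimiter are assumptions for illustration, not taken from the diff.

```python
from typing import List


def ctc_greedy_decode(
    ids: List[int], vocab: List[str], blank_id: int = 0, word_delimiter: str = "|"
) -> str:
    """Collapse consecutive repeats, remove blanks, and join tokens into text."""
    collapsed = []
    prev = None
    for i in ids:
        if i != prev:  # keep only the first id of each run
            collapsed.append(i)
        prev = i
    chars = [vocab[i] for i in collapsed if i != blank_id]
    return "".join(chars).replace(word_delimiter, " ").strip()


# Hypothetical character vocabulary; index 0 plays the role of the CTC blank
vocab = ["<pad>", "|", "H", "E", "L", "O"]
# Per-frame argmax ids; the blank between the two runs of "L" keeps both letters
frame_ids = [2, 2, 0, 3, 4, 4, 0, 4, 5, 5, 1, 0]
decoded = ctc_greedy_decode(frame_ids, vocab)
```

Repeats are collapsed before blanks are removed; doing it in the other order would merge the double "L" into one.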
22 changes: 20 additions & 2 deletions src/transformers/__init__.py
@@ -125,7 +125,14 @@
],
"models": [],
# Models
-"models.wav2vec2": ["WAV_2_VEC_2_PRETRAINED_CONFIG_ARCHIVE_MAP", "Wav2Vec2Config", "Wav2Vec2Tokenizer"],
+"models.wav2vec2": [
+"WAV_2_VEC_2_PRETRAINED_CONFIG_ARCHIVE_MAP",
+"Wav2Vec2Config",
+"Wav2Vec2CTCTokenizer",
+"Wav2Vec2Tokenizer",
+"Wav2Vec2FeatureExtractor",
+"Wav2Vec2Processor",
+],
"models.convbert": ["CONVBERT_PRETRAINED_CONFIG_ARCHIVE_MAP", "ConvBertConfig", "ConvBertTokenizer"],
"models.albert": ["ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP", "AlbertConfig"],
"models.auto": [
@@ -236,6 +243,7 @@
"TensorType",
"TokenSpan",
],
+"feature_extraction_utils": ["PreTrainedFeatureExtractor", "BatchFeature"],
"trainer_callback": [
"DefaultFlowCallback",
"EarlyStoppingCallback",
@@ -1205,6 +1213,9 @@
xnli_tasks_num_labels,
)

+# Feature Extractor
+from .feature_extraction_utils import BatchFeature, PreTrainedFeatureExtractor
+
# Files and general utilities
from .file_utils import (
CONFIG_NAME,
@@ -1330,7 +1341,14 @@
TransfoXLCorpus,
TransfoXLTokenizer,
)
-from .models.wav2vec2 import WAV_2_VEC_2_PRETRAINED_CONFIG_ARCHIVE_MAP, Wav2Vec2Config, Wav2Vec2Tokenizer
+from .models.wav2vec2 import (
+WAV_2_VEC_2_PRETRAINED_CONFIG_ARCHIVE_MAP,
+Wav2Vec2Config,
+Wav2Vec2CTCTokenizer,
+Wav2Vec2FeatureExtractor,
+Wav2Vec2Processor,
+Wav2Vec2Tokenizer,
+)
from .models.xlm import XLM_PRETRAINED_CONFIG_ARCHIVE_MAP, XLMConfig, XLMTokenizer
from .models.xlm_prophetnet import XLM_PROPHETNET_PRETRAINED_CONFIG_ARCHIVE_MAP, XLMProphetNetConfig
from .models.xlm_roberta import XLM_ROBERTA_PRETRAINED_CONFIG_ARCHIVE_MAP, XLMRobertaConfig