BatchFeature should cast to np.float32 by default #12862

Open
patrickvonplaten opened this issue Jul 23, 2021 · 0 comments
patrickvonplaten commented Jul 23, 2021

Currently the default dtype for speech feature extractors is numpy.float64, which leads to two problems:

  1. It makes data processing very expensive in terms of RAM. Many sound formats (such as .wav) store samples as int16; converting them to float64 unnecessarily increases RAM usage by a factor of 4. We should at least stick to float32, which only doubles it.
  2. Currently we have added some hacks to the Wav2Vec2 and Speech2TextTransformer feature extractors to prevent Double vs. Float dtype mismatches:
    input_values = [x.astype(np.float32) for x in input_values]
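To make the memory cost in point 1 concrete, here is a minimal sketch that compares the size of the same audio buffer in the three dtypes. The 16 kHz sampling rate and the random samples are assumptions chosen only for illustration.

```python
import numpy as np

# One second of 16 kHz audio, as it might be stored in a 16-bit PCM .wav file.
# The sampling rate and the random content are illustrative assumptions.
rng = np.random.default_rng(0)
pcm = rng.integers(-32768, 32767, size=16_000, dtype=np.int16)

as_float64 = pcm.astype(np.float64)
as_float32 = pcm.astype(np.float32)

print(pcm.nbytes)         # 32000
print(as_float64.nbytes)  # 128000 (4x the int16 buffer)
print(as_float32.nbytes)  # 64000  (2x the int16 buffer)
```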

The main problem is that np.asarray([....]) creates an np.float64 array by default when given Python floats, and we just pass that dtype along.
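The default can be checked directly. This is a small sketch with made-up values, showing that NumPy picks float64 for a list of Python floats unless a dtype is passed explicitly:

```python
import numpy as np

values = [0.1, -0.2, 0.3]  # plain Python floats (illustrative values)

default = np.asarray(values)
explicit = np.asarray(values, dtype=np.float32)

print(default.dtype)   # float64
print(explicit.dtype)  # float32
```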
=> We should either always cast to float32 in BatchFeature (see here: ) or add a dtype flag to BatchFeature.
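One way the dtype flag could look is sketched below. `to_numpy_batch` is a hypothetical standalone helper written for this illustration; it is not the actual BatchFeature code or API, and the choice to cast only floating-point fields is an assumption.

```python
import numpy as np


def to_numpy_batch(data, dtype=np.float32):
    """Hypothetical helper (not the transformers API): convert every value of
    a feature dict to a NumPy array, casting floating-point arrays to `dtype`.
    Passing dtype=None keeps NumPy's defaults."""
    converted = {}
    for key, value in data.items():
        array = np.asarray(value)
        # Only cast floats, so integer fields such as attention masks are untouched.
        if dtype is not None and np.issubdtype(array.dtype, np.floating):
            array = array.astype(dtype)
        converted[key] = array
    return converted


features = {
    "input_values": [[0.1, 0.2], [0.3, 0.4]],
    "attention_mask": [[1, 1], [1, 0]],
}

batch = to_numpy_batch(features)
print(batch["input_values"].dtype)  # float32

unchanged = to_numpy_batch(features, dtype=None)
print(unchanged["input_values"].dtype)  # float64
```

With a default of float32, per-extractor workarounds like the astype call quoted above would presumably no longer be needed, while callers who want float64 could still request it explicitly.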

patrickvonplaten self-assigned this Jul 23, 2021
huggingface deleted a comment from github-actions bot Aug 24, 2021
patrickvonplaten added the WIP label Aug 24, 2021