Example for using Wav2Vec2 SpeechToText from Huggingface #1939

Merged Nov 3, 2022 (6 commits)

Conversation

altre
Contributor

@altre altre commented Nov 2, 2022

Description

This is a simple step-by-step example of how to use TorchServe to download and serve a speech-to-text model from Hugging Face.
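
For readers who want a feel for the download step, here is a minimal sketch, assuming the `transformers` library and the `facebook/wav2vec2-base-960h` checkpoint named in the log output further down. It is not the `download_wav2vec2.py` script from this PR; the function name `download_model` and the target directory `model` are hypothetical.

```python
from pathlib import Path

# Checkpoint name as it appears in the log output of this PR.
MODEL_ID = "facebook/wav2vec2-base-960h"


def download_model(target_dir: str = "model", model_id: str = MODEL_ID) -> Path:
    """Fetch the model and its processor and save both to target_dir.

    target_dir is a hypothetical location chosen for this sketch.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    out = Path(target_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Save the weights and config, then the feature extractor and tokenizer,
    # so everything needed for inference sits in one directory.
    Wav2Vec2ForCTC.from_pretrained(model_id).save_pretrained(out)
    Wav2Vec2Processor.from_pretrained(model_id).save_pretrained(out)
    return out
```

Calling `download_model()` needs network access; the saved directory is what an archiving step would then package for TorchServe.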

Fixes #1656

Type of change

Added an example

Feature/Issue validation/testing

Following the instruction steps should produce output like the following:

❯ conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: done
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing pip dependencies: \ Ran pip subprocess with arguments:
['/Users/alanschelten/miniconda3/envs/wav2vec2env/bin/python', '-m', 'pip', 'install', '-U', '-r', '/Users/alanschelten/serve/examples/speech2text_wav2vec2/condaenv.r8y4m9_t.requirements.txt']
Pip subprocess output:
Collecting torch-workflow-archiver
  Using cached torch_workflow_archiver-0.2.4-1-py2.py3-none-any.whl (12 kB)
Collecting torch-model-archiver
  Using cached torch_model_archiver-0.6.0-py2.py3-none-any.whl (14 kB)
Collecting torchserve
  Using cached torchserve-0.6.0-1-py2.py3-none-any.whl (19.6 MB)
Collecting enum-compat
  Using cached enum_compat-0.0.3-py3-none-any.whl (1.3 kB)
Collecting future
  Using cached future-0.18.2-py3-none-any.whl
Collecting psutil
  Using cached psutil-5.9.3-cp310-cp310-macosx_11_0_arm64.whl (243 kB)
Requirement already satisfied: wheel in /Users/alanschelten/miniconda3/envs/wav2vec2env/lib/python3.10/site-packages (from torchserve->-r /Users/alanschelten/serve/examples/speech2text_wav2vec2/condaenv.r8y4m9_t.requirements.txt (line 3)) (0.37.1)
Requirement already satisfied: packaging in /Users/alanschelten/miniconda3/envs/wav2vec2env/lib/python3.10/site-packages (from torchserve->-r /Users/alanschelten/serve/examples/speech2text_wav2vec2/condaenv.r8y4m9_t.requirements.txt (line 3)) (21.3)
Collecting Pillow
  Using cached Pillow-9.3.0-cp310-cp310-macosx_11_0_arm64.whl (2.9 MB)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /Users/alanschelten/miniconda3/envs/wav2vec2env/lib/python3.10/site-packages (from packaging->torchserve->-r /Users/alanschelten/serve/examples/speech2text_wav2vec2/condaenv.r8y4m9_t.requirements.txt (line 3)) (3.0.9)
Installing collected packages: torch-workflow-archiver, enum-compat, psutil, Pillow, future, torchserve, torch-model-archiver
Successfully installed Pillow-9.3.0 enum-compat-0.0.3 future-0.18.2 psutil-5.9.3 torch-model-archiver-0.6.0 torch-workflow-archiver-0.2.4 torchserve-0.6.0

done
#
# To activate this environment, use
#
#     $ conda activate wav2vec2env
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Retrieving notices: ...working... done
❯ conda activate wav2vec2env
❯ ./download_wav2vec2.py
Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
❯ ./archive_model.sh
❯ torchserve --start --model-store model_store --models Wav2Vec2=Wav2Vec2.mar --ncs
TorchServe is already running, please use torchserve --stop to stop TorchServe.
❯ torchserve --start --model-store model_store --m
❯ torchserve --stop
TorchServe has stopped.
❯ torchserve --start --model-store model_store --models Wav2Vec2=Wav2Vec2.mar --ncs
<...server output...>
❯ curl -X POST http://127.0.0.1:8080/predictions/Wav2Vec2 --data-binary '@./sample.wav' -H "Content-Type: audio/basic"
I HAD THAT CURIOSITY BESIDE ME AT THIS MOMENT% 
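
The final `curl` call can also be made from Python. The sketch below uses only the standard library and mirrors the URL, method, and `Content-Type` header from the command above; the helper names `build_request` and `transcribe` and the 30-second timeout are assumptions for illustration, not part of this PR.

```python
import urllib.request

# Endpoint and model name taken from the curl command in the log.
PREDICT_URL = "http://127.0.0.1:8080/predictions/Wav2Vec2"


def build_request(audio_bytes: bytes, url: str = PREDICT_URL) -> urllib.request.Request:
    """Build a POST request carrying the raw audio bytes as the body."""
    return urllib.request.Request(
        url,
        data=audio_bytes,
        headers={"Content-Type": "audio/basic"},
        method="POST",
    )


def transcribe(wav_path: str, url: str = PREDICT_URL, timeout: float = 30.0) -> str:
    """Send a WAV file to the running TorchServe instance and return the reply text."""
    with open(wav_path, "rb") as f:
        request = build_request(f.read(), url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server started as in the log, `transcribe("./sample.wav")` should return the same transcription that `curl` printed.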

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

Member

@msaroufim msaroufim left a comment

Very nicely done, thank you!

Collaborator

@agunapal agunapal left a comment

Thanks @altre for the PR. Looks good overall. Please include the output of the sample example in the README.

@altre
Contributor Author

altre commented Nov 3, 2022

Ok, @agunapal, added the output.

Collaborator

@agunapal agunapal left a comment

Looks good. Thanks, @altre.

@codecov

codecov bot commented Nov 3, 2022

Codecov Report

Merging #1939 (b6edf9e) into master (33e1e97) will not change coverage.
The diff coverage is n/a.

❗ Current head b6edf9e differs from pull request most recent head 8f72ac0. Consider uploading reports for the commit 8f72ac0 to get more accurate results

@@           Coverage Diff           @@
##           master    #1939   +/-   ##
=======================================
  Coverage   44.95%   44.95%           
=======================================
  Files          63       63           
  Lines        2609     2609           
  Branches       56       56           
=======================================
  Hits         1173     1173           
  Misses       1436     1436           

@msaroufim msaroufim merged commit e18d8b8 into pytorch:master Nov 3, 2022

Successfully merging this pull request may close these issues.

Any examples on serving Speech2Text models from Huggingface, such as Wav2Vec2 ?
3 participants