Open
Description
🚀 The feature
Add a CTCDecoder.save_to_dir(save_dir: str | Path) function, which saves the lexicon, tokens, KenLM file, decoder_options, and anything else required to build the decoder to a directory.
Saving the KenLM file requires either support in flashlight-text or passing the file path to the CTCDecoder init instead of the KenLM object, so that the file can be copied to the save_dir.
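To make the second option concrete, here is a minimal sketch of what a path-based save and reload could look like. It is not the torchaudio API: the function names `save_decoder_files` and `load_decoder_kwargs`, the `decoder_config.json` file name, and the JSON layout are all hypothetical, and the sketch assumes the lexicon, tokens, and language model are available as file paths and that decoder_options is JSON-serializable.

```python
import json
import shutil
from pathlib import Path
from typing import Any, Dict, Optional, Union

PathLike = Union[str, Path]
CONFIG_NAME = "decoder_config.json"  # hypothetical file name, not a torchaudio convention


def save_decoder_files(
    save_dir: PathLike,
    lexicon: PathLike,
    tokens: PathLike,
    lm: Optional[PathLike] = None,
    decoder_options: Optional[Dict[str, Any]] = None,
) -> Path:
    """Copy the decoder's files into save_dir and record them in a JSON config."""
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)

    config: Dict[str, Any] = {"decoder_options": dict(decoder_options or {})}
    for key, src in {"lexicon": lexicon, "tokens": tokens, "lm": lm}.items():
        if src is None:
            config[key] = None  # e.g. decoding without a language model
            continue
        src = Path(src)
        # Assumes the source files have distinct base names.
        shutil.copy2(src, save_dir / src.name)
        config[key] = src.name  # store names relative to save_dir

    (save_dir / CONFIG_NAME).write_text(json.dumps(config, indent=2))
    return save_dir


def load_decoder_kwargs(save_dir: PathLike) -> Dict[str, Any]:
    """Read the config back and resolve the file names against save_dir."""
    save_dir = Path(save_dir)
    config = json.loads((save_dir / CONFIG_NAME).read_text())

    kwargs: Dict[str, Any] = dict(config["decoder_options"])
    for key in ("lexicon", "tokens", "lm"):
        name = config[key]
        kwargs[key] = str(save_dir / name) if name is not None else None
    return kwargs
```

Storing file names relative to the directory keeps the saved folder relocatable, which matters if it is later uploaded to the hub and downloaded elsewhere. The returned dictionary is shaped so that it could be passed as keyword arguments to a decoder factory that accepts `lexicon`, `tokens`, and `lm` paths; whether the eventual torchaudio method would use this layout is an open design question.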
Motivation, pitch
HF transformers is looking at changing its dependency on pyctcdecode to the torchaudio CTCDecoder (huggingface/transformers/issues/41230).
To support pushing the decoder to the hub, the torchaudio CTCDecoder needs something equivalent to pyctcdecode.BeamSearchDecoderCTC.save_to_dir.
I'll be happy to make a PR.
Alternatives
No response
Additional context
No response