
ifeimi/whisper48

 
 


Whisper48

English | 简体中文 (Simplified Chinese, in progress)

This README summarizes the structure and content of the project. A detailed explanation of the script, along with further help, can be found on my website.

Running on Google Colab

WhisperX: This project is deployed as a notebook on Google Colab; click to open it. Follow the instructions and execute each cell in order.

Running locally

You can try running this script on your own computer. You will first need to know how to install PyTorch, WhisperX, ffmpeg, and the other dependencies. Note that the script has not been tested locally yet.
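Before attempting a local run, it can help to confirm that the dependencies named above are actually visible to your Python environment. The following is a minimal sketch, not part of this project: the function name `check_dependencies` is hypothetical, and it assumes that PyTorch and WhisperX are importable as `torch` and `whisperx` and that ffmpeg is expected on your `PATH`. It only checks that each item can be found; it does not verify versions or GPU support.

```python
import importlib.util
import shutil


def check_dependencies(modules=("torch", "whisperx"), executables=("ffmpeg",)):
    """Return a dict mapping each dependency name to True if it was found.

    `modules` are Python import names; `executables` are command-line tools
    looked up on PATH. Both defaults are assumptions for illustration.
    """
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        status[name] = importlib.util.find_spec(name) is not None
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH
        status[exe] = shutil.which(exe) is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Anything reported as missing needs to be installed by following that tool's own installation guide before the script can run.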

Examples

In the near future I will add some subtitle examples generated by WhisperX, together with comparisons against other models. These will include subtitles for short videos, long videos, and videos containing music. However, the corresponding video or audio files will not be provided, for copyright reasons.

Contribute and Support

A recent update of WhisperX integrated faster-whisper, but Japanese timestamping has become problematic as a result. See Issue #200, "V3 sentence segment issue". By contrast, faster-whisper as integrated in N46Whisper does give shorter and accurate segments. I am still working on this problem and trying to add forced alignment to N46Whisper's output.

I modify this project heavily to suit my own needs, but feel free to submit issues or contact me directly if you have any suggestions. I also encourage you to contribute to the "original" N46Whisper project.

Contact me anytime by email: yfwu0202 AT gmail dot com. I would love to help with any questions and to hear your suggestions.

Please also consult the documentation for Whisper, WhisperX, and faster-whisper.

Acknowledgements and Copyright Notice

This project started as a fork of N46Whisper. Part of the code was left unchanged and is used under the MIT license. Modifications were made to incorporate more accurate Whisper-based models (WhisperX, for example) and to suit other personal needs.

This script relies on WhisperX, which improves on OpenAI's Whisper by providing more accurate timestamps, down to the word level. This is achieved by force-aligning the inaccurate timestamps generated by Whisper against a speech model (wav2vec 2.0, for example).
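To illustrate why word-level timestamps help, here is a toy sketch in plain Python. It is not WhisperX's implementation and performs no forced alignment itself: it assumes the word timings have already been produced by some aligner, and only shows how a coarse segment can then be tightened to the span of its words. The function name `tighten_segments` and the dictionary layout are hypothetical, chosen for illustration.

```python
def tighten_segments(segments, words):
    """Shrink each coarse segment to the span of the aligned words inside it.

    segments: list of dicts with "start", "end", "text" (coarse timestamps).
    words: list of dicts with "word", "start", "end" (word-level timestamps,
           assumed to come from a forced aligner).
    """
    refined = []
    for seg in segments:
        # Assign a word to a segment when the word's midpoint falls inside it
        inside = [
            w for w in words
            if seg["start"] <= (w["start"] + w["end"]) / 2 <= seg["end"]
        ]
        if inside:
            start = min(w["start"] for w in inside)
            end = max(w["end"] for w in inside)
        else:
            # No aligned words: keep the coarse timestamps unchanged
            start, end = seg["start"], seg["end"]
        refined.append({"start": start, "end": end, "text": seg["text"]})
    return refined


# A coarse 5-second segment whose speech actually spans 1.2 s to 2.3 s
segments = [{"start": 0.0, "end": 5.0, "text": "hello world"}]
words = [
    {"word": "hello", "start": 1.2, "end": 1.6},
    {"word": "world", "start": 1.7, "end": 2.3},
]
refined = tighten_segments(segments, words)
print(refined)
```

In this example the subtitle would be shown from 1.2 s to 2.3 s instead of for the full five seconds, which is the kind of improvement that matters for subtitle timing.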

This project is released under the MIT license. See LICENSE for further details.

This README file was last modified on 2023-05-08.



About

Speech-to-text transcriber based on WhisperX, deployed on Google Colab
