Whisper48

English | 简体中文 (in progress)

This README summarizes the structure and content of the project. Detailed explanations of the script and further help can be found on my website.

云端运行 Working on Google Colab

WhisperX: This project is deployed as a notebook on Google Colab; click to open. Follow the instructions and execute each cell in order.

本地运行 Working locally

You can also try running this script on your own computer. You will first need to know how to install PyTorch, WhisperX, ffmpeg, ... on your machine. Running the script locally has not been tested yet.
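Before attempting a local run, it may help to check that the prerequisites are visible to Python. The snippet below is a minimal sketch, not part of this project: the helper name `check_prerequisites` is hypothetical, and it assumes the packages are importable as `torch` and `whisperx` and that the `ffmpeg` executable is on your `PATH`. It only checks that they can be found, not that the installed versions are compatible.

```python
import importlib.util
import shutil


def check_prerequisites(modules=("torch", "whisperx"), executables=("ffmpeg",)):
    """Return a dict mapping each requirement name to True if it was found.

    Hypothetical helper for illustration; the default names are assumptions.
    """
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported.
        status[name] = importlib.util.find_spec(name) is not None
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        status[exe] = shutil.which(exe) is not None
    return status


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Anything reported as missing needs to be installed before the script can run.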

示例 Examples

I will add subtitle examples generated by WhisperX, along with comparisons against other models, in the near future. These will cover short videos, long videos, and videos containing music. However, the corresponding video or audio files will not be provided for copyright reasons.

参与和帮助 Contribute and Support

A recent update of WhisperX integrated faster-whisper; however, Japanese timestamping has become problematic. See Issue #200 V3 sentence segement issue. By contrast, the faster-whisper integration in N46Whisper does give shorter and accurate segments. I'm still working on this problem and am trying to add forced alignment to N46Whisper's output.

I have modified this project heavily to suit my own needs, but feel free to submit issues or contact me directly if you have any suggestions. I also encourage you to contribute to the "original" N46Whisper project.

Contact me anytime by email: yfwu0202 AT gmail dot com. I would love to help with any questions and to hear any suggestions you have.

Please also consult the documentation for Whisper, WhisperX, and faster-whisper.

致谢和版权 Acknowledgements and Copyright notice

This project started as a fork of N46Whisper. Part of the code was left unchanged and is used under the MIT license. Modifications were made to incorporate more accurate Whisper-based models (WhisperX, for example) and to adapt the script to my other personal needs.

This script relies on WhisperX, which improves on OpenAI's Whisper by providing more accurate timestamps, down to the word level. This is achieved by force-aligning the inaccurate timestamps generated by Whisper with a speech model (wav2vec 2.0, for example).
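To illustrate how timestamps end up in a subtitle file, here is a minimal sketch that turns a list of timed segments into SRT text. It is not taken from this project's code: the function names are hypothetical, and the input is assumed to be a list of dicts with `start`, `end` (in seconds) and `text` keys, which may differ from what your transcription step actually returns.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from segments shaped like {"start", "end", "text"}.

    The segment shape is an assumption made for this example.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each SRT cue: running index, time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.25, "text": " Hello "},
        {"start": 1.5, "end": 3.0, "text": "world"},
    ]
    print(segments_to_srt(demo))
```

More accurate start and end times in the input translate directly into better-timed cues, which is why the alignment step matters for subtitles.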

This project is released under the MIT license. See LICENSE for further details.

This README file was last modified on 2023-05-08.
