This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Please add some parameters for standardizing/beautifying subtitle layout #68
Comments
The standalone version doesn’t appear to have any source code, so I can’t decipher what’s happening. We use stable-ts, but there are different ways to split the dialogue. See https://github.com/jianfch/stable-ts?tab=readme-ov-file#regrouping-words. Open to any suggestions.
I made a separate branch if you want to toy with the idea: https://github.com/McCloudS/subgen/blob/Custom-Params/subgen.py It takes Instructions pasted below: Regroup (in-place) words into segments.
I'm still toying around, but
Hey, I'm a Windows user, and I'm really grateful for Subgen as it's the simplest way to get Whisper running with Bazarr on Windows without having to use Docker etc.
However, one thing I've noticed is that the subtitles aren't formatted the best, due to how Faster-Whisper operates. I've found that the standalone Faster Whisper (https://github.com/Purfview/whisper-standalone-win) has a great optional argument called --standard, which does the following:
--standard: Quick hardcoded preset to split lines in standard way. 42 chars per 2 lines with max_comma_cent=70 and --sentence are activated automatically.
--sentence: Enables splitting lines to sentences for srt and vtt subs. Every sentence starts in the new segment. Be default meant to output whole sentence per line for better translations, but not limited to, read about '--max_...' parameters.
This gives the subtitles a much more standardized look, like the one common across streaming services such as Netflix, BBC etc.
Is it possible to implement these into SubGen, please?
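To make the request concrete, here is a rough sketch of the layout rule described above: one sentence per cue, wrapped at 42 characters per line with at most 2 lines per cue. It is plain Python and only an illustration. It is not how the standalone Faster Whisper or stable-ts implements this, the function names are made up for the example, and it ignores word timings and the max_comma_cent setting entirely.

```python
import re
import textwrap

MAX_CHARS = 42  # characters per line, taken from the --standard description
MAX_LINES = 2   # lines per subtitle cue, taken from the --standard description


def split_sentences(text):
    """Naive sentence split: break on whitespace that follows ., ! or ?"""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def layout_cues(text, max_chars=MAX_CHARS, max_lines=MAX_LINES):
    """Return a list of cues; each cue is a list of at most max_lines lines.

    Hypothetical helper for illustration only: every sentence starts a new
    cue, and a sentence that needs more than max_lines lines spills over
    into additional cues.
    """
    cues = []
    for sentence in split_sentences(text):
        lines = textwrap.wrap(sentence, width=max_chars)
        for i in range(0, len(lines), max_lines):
            cues.append(lines[i:i + max_lines])
    return cues


if __name__ == "__main__":
    sample = (
        "Hello there. This is a considerably longer sentence that will not "
        "fit on a single subtitle line at all, so it has to wrap."
    )
    for cue in layout_cues(sample):
        print("\n".join(cue))
        print()
```

A real implementation would have to split on word timestamps rather than on text alone, so that each cue still gets correct start and end times.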