Add new option to generate subtitles by a specific number of words #1729

amolinasalazar · 2023-10-22T16:14:49Z

*Updated according to jongwook's code review

Added a new word-level option called --max_words_per_line that generates subtitles with a maximum number of words per segment. This may sound similar to the --max_line_width option, but the results are more pleasant for readers, IMHO. Here are a couple of comparisons using .SRT files:

Note that --max_words_per_line works as an upper bound on the number of words, but it still respects segment boundaries: the end of a sentence can have fewer words if the number of words remaining in the segment is lower than the max_words_per_line value.
e.g. Segment = [word1, word2, word3, word4, word5] and max_words_per_line = 3
=> Result = [word1, word2, word3] and [word4, word5]
This differs from the behaviour of --max_line_width, which can leave bigger gaps of time when joining the end of one segment with the beginning of the next:

Subtitles generated with --max_words_per_line look similar to what we can see in Shorts, Reels and other short-form videos.

This is my first contribution, so feel free to change, comment on, or improve anything.
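The chunking rule described above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the code merged in this PR; the helper names `split_by_max_words` and `chunk_segments` are hypothetical, and words are represented as plain strings rather than timestamped entries.

```python
from typing import List


def split_by_max_words(segment_words: List[str], max_words_per_line: int) -> List[List[str]]:
    """Split one segment into chunks of at most max_words_per_line words."""
    if max_words_per_line < 1:
        raise ValueError("max_words_per_line must be a positive integer")
    return [
        segment_words[i:i + max_words_per_line]
        for i in range(0, len(segment_words), max_words_per_line)
    ]


def chunk_segments(segments: List[List[str]], max_words_per_line: int) -> List[List[str]]:
    """Chunk each segment on its own, so no chunk spans two segments."""
    chunks: List[List[str]] = []
    for words in segments:
        chunks.extend(split_by_max_words(words, max_words_per_line))
    return chunks


segment = ["word1", "word2", "word3", "word4", "word5"]
print(split_by_max_words(segment, 3))
# [['word1', 'word2', 'word3'], ['word4', 'word5']]
```

Because each segment is chunked independently, the last chunk of a segment may be shorter than the limit instead of borrowing words from the next segment.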

Additional notes

The use of --max_line_width will disable the effects of --max_words_per_line.
Manually tested using Python and the CLI, and checked the results in .srt and .vtt files (.txt and .tsv files won't be affected).
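The precedence between the two options could be modelled roughly as follows. This is a minimal sketch of the rule stated in the notes above, assuming a warning is emitted when both options are set; the function `effective_word_limit` and the warning text are hypothetical and are not taken from the PR's code.

```python
import warnings
from typing import Optional


def effective_word_limit(
    max_line_width: Optional[int],
    max_words_per_line: Optional[int],
) -> Optional[int]:
    """Return the word limit that would apply, or None if it is unset or ignored."""
    if max_line_width is not None and max_words_per_line is not None:
        # Assumed behaviour: max_line_width takes precedence and the user is warned.
        warnings.warn("max_words_per_line has no effect when max_line_width is set")
        return None
    return max_words_per_line
```

With this sketch, `effective_word_limit(None, 3)` returns 3, while `effective_word_limit(40, 3)` warns and returns None.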

ADD warning for max_line_width compatibility

FurkanGozukara · 2023-10-22T16:16:38Z

amazing

khaledbkheet · 2023-10-25T23:17:37Z

from pydub import AudioSegment

song = AudioSegment.from_mp3("good_morning.mp3")

# PyDub handles time in milliseconds
ten_minutes = 10 * 60 * 1000

first_10_minutes = song[:ten_minutes]

first_10_minutes.export("good_morning_10.mp3", format="mp3")

FurkanGozukara · 2023-11-06T10:19:40Z

@amolinasalazar which word count do you suggest for youtube?

amolinasalazar · 2023-11-06T17:18:57Z

> @amolinasalazar which word count do you suggest for youtube?

Actually, I think that's a really personal choice, and it can depend on several things.

In the end, the main reason Reels or Shorts normally show just a couple of words on screen at a time is the aspect ratio of the videos. Long subtitle lines on videos watched on mobile phones in vertical orientation would fill the whole screen with words.

There are other factors, like the font size, the speed of the speech, or even the complexity of the content. Fewer words normally make for dynamic and impactful videos, ideal for simple and strong messages, but they can be stressful to follow in long videos. For example, I wouldn't show 1-3 words at a time if you are explaining a hard topic, as it could be hard to follow.

So, in my opinion, you need to find a comfortable number on your own, but something between 3 and 6 words can be pleasant in general.

…penai#1729)
* ADD parser for new argument --max_words_count
* ADD max_words_count in words_options
  ADD warning for max_line_width compatibility
* ADD logic for max_words_count
* rename to max_words_per_line
* make them kwargs
* allow specifying file path by --model
* black formatting
---------
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>

demoskalifi · 2023-12-26T09:59:35Z

Can I use this with the OpenAI Whisper API? If so, how?

Francoyy · 2024-04-10T10:29:02Z

The option name is --max_words_per_line and not --max_words_count (https://github.com/openai/whisper/pull/1729/files#diff-f6accbbb4ebcd3dd6815bf012490d9ba37eb89a65f2124adc95c2a39bc6941b7R422)
An example command would be:
whisper file.mp4 --language English --model large-v3 --output_format srt --word_timestamps True --max_words_per_line 6
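For anyone scripting this, the same command can be assembled as an argument list in Python. This is a small sketch that uses only the flags from the example above; the helper `build_whisper_command` is hypothetical, and it assumes the `whisper` CLI is installed and on the PATH if the command is actually run.

```python
import shlex
from typing import List


def build_whisper_command(
    media_path: str,
    max_words_per_line: int,
    model: str = "large-v3",
    language: str = "English",
    output_format: str = "srt",
) -> List[str]:
    """Build the argument list for the example whisper command."""
    return [
        "whisper", media_path,
        "--language", language,
        "--model", model,
        "--output_format", output_format,
        # Word timestamps are enabled, as in the example command above.
        "--word_timestamps", "True",
        "--max_words_per_line", str(max_words_per_line),
    ]


cmd = build_whisper_command("file.mp4", 6)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` to run the transcription.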

amolinasalazar · 2024-04-10T11:38:57Z

> The command name is --max_words_per_line and not --max_words_count (https://github.com/openai/whisper/pull/1729/files#diff-f6accbbb4ebcd3dd6815bf012490d9ba37eb89a65f2124adc95c2a39bc6941b7R422) An example of command would be whisper file.mp4 --language English --model large-v3 --output_format srt --word_timestamps True --max_words_per_line 6

True, jongwook renamed this option before merging the PR. I'll update the first comment so there is no confusion.

amolinasalazar added 3 commits October 22, 2023 17:37

ADD parser for new argument --max_words_count

7f0dc9e

ADD max_words_count in words_options

f11faf2

ADD warning for max_line_width compatibility

ADD logic for max_words_count

376acb2

amolinasalazar marked this pull request as ready for review October 22, 2023 16:17

jongwook added 4 commits November 6, 2023 01:11

rename to max_words_per_line

8e5200b

make them kwargs

541adb4

allow specifying file path by --model

832c0e9

black formatting

1775889

jongwook merged commit 6ed314f into openai:main Nov 6, 2023
8 checks passed
