Split segment to words #30

chichiller · 2023-10-24T21:51:15Z

First of all, thank you for your work.
I have a question: when I transcribe an audio file as PCM [Float], I receive [Segment] as the result.
I noticed that each Segment may contain a whole sentence rather than a single word.
How can I split a sentence into separate words, with a timestamp for each?
I tried to use WhisperParams fields:

max_len = 1
split_on_word = true

but the result is always the same.
The only thing that helps reduce the number of words per segment is the beamSearch strategy, but I still get sentences instead of separate words.

My code:

let params = WhisperParams(strategy: .beamSearch)
params.max_len = 1
params.split_on_word = true
whisper = Whisper(fromFileURL: modelUrl, withParams: params)


exPHAT · 2023-10-24T22:36:02Z

You can see the correct usage in #6
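
If word-level segments cannot be obtained from the model directly, one possible fallback is to approximate per-word timestamps from each segment's own start and end times. The following is only a sketch under stated assumptions: `TimedText` is a hypothetical stand-in for the library's `Segment` (its field names and millisecond units are assumed here, not taken from this thread), and `splitIntoWords` is an illustrative helper, not part of any library. It spreads the segment's duration across words in proportion to their character count, so the result is an estimate rather than the model's own timing.

```swift
// Hypothetical stand-in for the library's Segment type. The field names and
// the millisecond units are assumptions made for this sketch.
struct TimedText {
    let startTime: Int
    let endTime: Int
    let text: String
}

// Splits one segment into words and distributes the segment's duration
// across them in proportion to each word's character count.
// This is an approximation, not token-level timing from the model.
func splitIntoWords(_ segment: TimedText) -> [TimedText] {
    let words = segment.text
        .split(whereSeparator: { $0.isWhitespace })
        .map(String.init)
    let totalChars = words.reduce(0) { $0 + $1.count }
    guard totalChars > 0 else { return [] }

    let duration = segment.endTime - segment.startTime
    var result: [TimedText] = []
    var consumedChars = 0
    for word in words {
        // Each word starts where the previous one ended.
        let start = segment.startTime + duration * consumedChars / totalChars
        consumedChars += word.count
        let end = segment.startTime + duration * consumedChars / totalChars
        result.append(TimedText(startTime: start, endTime: end, text: word))
    }
    return result
}
```

Applied to every element of the transcription result (for example with `flatMap`), this yields one entry per word. Because the timing is interpolated, long pauses inside a segment will be attributed to the neighbouring words.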

exPHAT closed this as completed Oct 24, 2023
