whisper_full_get_token_text crash by run in a loop in swift #1652

Open
bitsmakerde opened this issue Dec 18, 2023 · 1 comment

@bitsmakerde

Hi,

In the SwiftUI example, the transcription quality is terrible for files longer than 30 seconds.

To solve this, I want to split my files into 30-second parts and loop over them, also using initial prompts, to get the best result.
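
For illustration, the splitting step could look roughly like the sketch below. It is not code from the example app: `splitIntoChunks` is a hypothetical helper, and it assumes the audio has already been decoded to mono `Float` samples at 16 kHz (the sample rate whisper.cpp expects).

```swift
/// Hypothetical helper: splits PCM samples into consecutive chunks of at most
/// `seconds` seconds. Assumes mono audio sampled at `sampleRate` Hz.
func splitIntoChunks(_ samples: [Float],
                     seconds: Int = 30,
                     sampleRate: Int = 16_000) -> [[Float]] {
    let chunkSize = seconds * sampleRate
    guard chunkSize > 0, !samples.isEmpty else { return [] }
    // Step through the buffer in chunk-sized strides; the last chunk may be shorter.
    return stride(from: 0, to: samples.count, by: chunkSize).map { start in
        Array(samples[start..<min(start + chunkSize, samples.count)])
    }
}
```

Cutting at fixed 30-second boundaries can split a word in the middle, so cutting at a pause near the boundary may give better results.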

But as soon as I process more than one file, my app crashes, and I have no idea how to fix it.

I get this error:

Selecting 8 threads
About to run whisper_full
WHISPER_ASSERT: /Users/andre/Documents/GitHub/whisper.cpp/whisper.cpp:4531: n_logits == ctx.vocab.n_vocab

Here is how I loop over my files:

func transcribeSample() {
    if let sampleUrl {
        let segments = [sampleUrl, sampleUrl]

        for segment in segments {
            print("segment: \(segment)")
            transcribeAudio(segment)
        }
    } else {
        messageLog += "Could not locate sample\n"
    }
}

And here is my configuration:

func fullTranscribe(samples: [Float]) {
    // Leave 2 processors free (i.e. the high-efficiency cores).
    let maxThreads = max(1, min(8, cpuCount() - 2))
    print("Selecting \(maxThreads) threads")
    let initialPrompt = "ähm, ah, äh, ahm, äähm, ääh, ähh, uhm, eh, ehm, hmm, mm, mhm, mmm, uh, um. I mean, mean."

    var params = whisper_full_default_params(WHISPER_SAMPLING_GREEDY)
    // The pointer passed to a withCString closure is only valid inside that
    // closure, so whisper_full runs while both C strings are still alive.
    initialPrompt.withCString { prompt in
        "de".withCString { language in
            // Adapted from whisper.objc
            params.print_realtime = true
            params.print_progress = false
            params.print_timestamps = true
            params.print_special = true
            params.translate = false
            params.language = language
            params.n_threads = Int32(maxThreads)
            params.offset_ms = 0
            params.no_context = true
            params.single_segment = false
            params.suppress_blank = false
            params.initial_prompt = prompt

            params.max_len = 1
            params.split_on_word = true
            params.token_timestamps = true

            whisper_reset_timings(context)
            print("About to run whisper_full")
            samples.withUnsafeBufferPointer { samples in
                if whisper_full(context, params, samples.baseAddress, Int32(samples.count)) != 0 {
                    print("Failed to run the model")
                } else {
                    whisper_print_timings(context)
                }
            }
        }
    }
}
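
One detail worth checking in a configuration like this: Swift documents that the pointer handed to a `withCString` closure is valid only while that closure runs, so both the language string and the initial prompt need to stay alive for the whole `whisper_full` call. The sketch below shows one way to scope two C strings together. `withCStrings` is a hypothetical helper, not part of whisper.cpp or the Swift standard library, and whether this lifetime issue is related to the assertion above is not confirmed.

```swift
/// Hypothetical helper: keeps two C strings alive for the duration of `body`.
/// The pointers must not be stored or used after `body` returns.
func withCStrings<R>(_ first: String, _ second: String,
                     _ body: (UnsafePointer<CChar>, UnsafePointer<CChar>) throws -> R) rethrows -> R {
    try first.withCString { a in
        try second.withCString { b in
            try body(a, b)
        }
    }
}

// Usage: both pointers are valid inside the closure, and only there.
// In the real app this is where params would be filled in and whisper_full called.
let roundTrip = withCStrings("de", "ähm, äh") { language, prompt in
    String(cString: language) + "|" + String(cString: prompt)
}
print(roundTrip)
```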

@bitsmakerde (Author)

OK, I found out a bit more about where my app is crashing.

The first run works perfectly, but the second run crashes at this line of code:

String(cString: whisper_full_get_token_text(context, i, token))

I want to get every word back together with its start and end time. I have no idea why it runs once but not a second time. The context is the same as in the first run, and so are the model and the sample file.
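
Since the audio is processed in 30-second parts, the per-word times reported for each part are relative to that part and need to be shifted onto the timeline of the full recording. A minimal sketch of that conversion follows. `TimedWord` and `absoluteWord` are hypothetical names, and the sketch assumes that the token `t0`/`t1` values are expressed in units of 10 ms and that every part except the last is exactly `chunkSeconds` long.

```swift
/// Hypothetical value type: one recognized word with times in seconds.
struct TimedWord: Equatable {
    let text: String
    let start: Double
    let end: Double
}

/// Converts chunk-relative token times (assumed to be in units of 10 ms)
/// into seconds on the timeline of the full recording.
func absoluteWord(text: String, t0: Int64, t1: Int64,
                  chunkIndex: Int, chunkSeconds: Double = 30) -> TimedWord {
    // Every earlier chunk contributes chunkSeconds to the offset.
    let offset = Double(chunkIndex) * chunkSeconds
    return TimedWord(text: text,
                     start: offset + Double(t0) / 100,
                     end: offset + Double(t1) / 100)
}
```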

@bitsmakerde changed the title from "split audio in 30 sec parts, crash on swift" to "whisper_full_get_token_text crash by run in a loop in swift" on Dec 18, 2023