Great app, really fast on my iPhone!
I was wondering if it'd be possible to get word-level timestamps and/or speaker diarization.
I noticed ggml whisper has introduced diarization recently, so perhaps it's possible in your app now?
Would be great for podcasts or meetings.