transcription while app is running in background #16

Open
rawwerks opened this issue Aug 21, 2023 · 7 comments

Comments

@rawwerks

This app is amazing! I can't believe you are giving it away for free!

I see that you have the following future feature planned:
> Enable background transcription when the app is minimized, allowing users to perform other tasks while the transcription proceeds.

I'm simply filing this issue to vote +1 for this feature. With this feature, I think your app would fully replace cloud-based transcription apps like Otter.ai. I love the privacy, portability, and simplicity of transcribing on my device, but I need to be able to do other things while the transcription runs in the background, especially when using one of the larger models.

Owner

Saik0s commented Aug 22, 2023

Hello @rawwerks, I appreciate your feedback! Regrettably, iOS doesn't allow long background tasks when the app is minimized. I'm considering trying out audio recording or playback during transcription to see if it continues to work when minimized. Theoretically, it should function because the app remains active while audio is being recorded or played, but this needs testing, and there is a high chance it would not pass the App Store review process. Additionally, I'm already in the process of adding a custom cloud-based transcription feature using whisperX.

Author

rawwerks commented Aug 22, 2023

> Regrettably, iOS doesn't allow long background tasks when the app is minimized. I'm considering trying out audio recording or playback during transcription to see if it continues to work when minimized.

Maybe something like having the app "read back" the transcription (text to speech) - which could be muted of course - just to have a "legitimate excuse" for letting the transcription run in the background. I don't know much about the App Store review process, but it seems to me that this audio playback would be a legitimate feature that they shouldn't have a problem with. (They don't need to know the true reason for the audio playback ;) )

@rawwerks
Author

> Additionally, I'm already in the process of adding a custom cloud-based transcription feature using whisperX.

Awesome! I would definitely pay for this feature, assuming there was reasonable privacy/encryption/TOS in place.


ldenoue commented Sep 4, 2023

@Saik0s I wonder why you'd offer an online version? Speed? Reliability? Otter is able to perform transcription while the app is in the background, e.g. https://www.andyibanez.com/posts/modern-background-tasks-ios13/

Is that what you tried, and it didn't work?

Owner

Saik0s commented Sep 4, 2023

@ldenoue, I am integrating online transcription using a large model, aiming for speed and quality. Additional cloud-based features, including text refinement via an LLM, are also planned.

The background transcription you referenced is already implemented, but its reliability varies. The next update will introduce resumable transcriptions to improve the situation.
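
For readers wondering what "resumable transcriptions" could look like in practice, here is a minimal sketch. It is not the app's implementation and is written in Python purely for illustration. It assumes the audio is already split into chunks and that each finished chunk's text is saved to a JSON checkpoint, so a run interrupted by the system can continue from the last finished chunk. The names `transcribe_chunk`, `transcribe_resumable`, and the checkpoint file layout are hypothetical.

```python
import json
import os
import tempfile
from typing import Callable, List, Sequence


def load_checkpoint(path: str) -> List[str]:
    """Return the chunk texts saved so far, or an empty list if none exist."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_checkpoint(path: str, texts: List[str]) -> None:
    """Write to a temp file and rename, so an interruption mid-write
    cannot leave a half-written checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(texts, f)
    os.replace(tmp_path, path)


def transcribe_resumable(
    chunks: Sequence[bytes],
    transcribe_chunk: Callable[[bytes], str],  # hypothetical model call
    checkpoint_path: str,
) -> str:
    """Transcribe chunks in order, skipping those already checkpointed."""
    done = load_checkpoint(checkpoint_path)
    for chunk in chunks[len(done):]:
        done.append(transcribe_chunk(chunk))
        # Persist after every chunk: at most one chunk of work is lost.
        save_checkpoint(checkpoint_path, done)
    return " ".join(done)
```

Calling `transcribe_resumable` again with the same chunks and checkpoint path after an interruption only processes the chunks that were not finished. The trade-off is one small disk write per chunk.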


LeetaoGoooo commented Sep 15, 2023

> Regrettably, iOS doesn't allow long background tasks when the app is minimized. I'm considering trying out audio recording or playback during transcription to see if it continues to work when minimized.

It works well, really amazing~

@coder543

@Saik0s I appreciate how much work you've put into this, and how it is open source.

I know the cloud subscription would be a way to make money off of the app, and I think that makes sense.

I have my own dedicated Linux server for running machine learning models, and I would love to have a way to offload the transcription task from this app to that server. I don't think I would pay $4/mo to use my own server, but I would probably be willing to pay a $5 or $10 one-time fee to unlock that functionality.

So, while you're developing the cloud stuff... if you see an easy way to let the user configure the app to point to a custom server, and if you provided some general guidance of what kind of API to expose for the app to work correctly, I think that would be awesome, even if there was a small fee for that feature to help support this app.

It's your app, of course, so do whatever you think is best. I'm just mentioning something I was thinking about recently.
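
To make the request above concrete, here is a minimal sketch of what a self-hosted endpoint might look like. Everything in it is an assumption: the thread does not define a server protocol for the app, so the `/transcribe` path, the JSON response shape, and the `transcribe_audio` placeholder are hypothetical and would need to match whatever the app eventually expects.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def transcribe_audio(audio: bytes) -> str:
    """Placeholder (hypothetical): a real server would run a
    Whisper-style model on the uploaded audio here."""
    return f"<{len(audio)} bytes of audio received>"


def build_response(path: str, body: bytes) -> Tuple[int, dict]:
    """Pure request logic: returns (HTTP status, JSON-serializable payload)."""
    if path != "/transcribe":  # hypothetical endpoint path
        return 404, {"error": "unknown endpoint"}
    if not body:
        return 400, {"error": "empty audio body"}
    return 200, {"text": transcribe_audio(body)}


class TranscribeHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        status, payload = build_response(self.path, body)
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Start the server; blocks until interrupted."""
    HTTPServer((host, port), TranscribeHandler).serve_forever()
```

Running `serve("0.0.0.0", 8000)` would expose the endpoint on the local network. A real deployment would also need authentication and TLS, which this sketch leaves out.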
