Open Source auto-bleeper for podcasts and songs #381

Open
7 of 21 tasks
ppeters0502 opened this issue May 9, 2024 · 3 comments
Labels
AI/ML Artificial Intelligence and Machine Learning. Including, but not limited to, creating Skynet. APIs/Backend Like getting feature requests from the frontend team? Look no further! Extension/Plugin/Add-on Extend a product you enjoy, and make it even better! Intermediate Projects that require a medium level of understanding. Doesn't require much prior knowledge. Mobile app Ideas that will result in a mobile application. Much work This project takes little time to complete. (ETA several weeks+) Web app Applications on the web. Perhaps with React? Or Vue? Or Angular?

Comments

@ppeters0502

ppeters0502 commented May 9, 2024

Project description

Right now my kids are at the age where they repeat anything they hear, including swear words. Since there are already plenty of services that transcribe audio into text with reasonable accuracy, I thought it might be useful to combine speech-to-text services with popular media player APIs to automatically "bleep" out swear words in a given audio stream or file.

Relevant Technology

There are tons of different speech-to-text services that (with a subscription or sometimes a per-file fee) can transcribe an audio file and return a JSON or TXT file containing the transcribed words with timestamps.
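
As a rough sketch of what working with such a transcript might look like, the snippet below scans word-level timestamps for entries in a blocklist and merges adjacent hits. The JSON shape (a `words` list whose entries carry `word`, `start`, and `end` in seconds) and the names `BLOCKLIST` and `find_bleep_intervals` are assumptions for illustration; each real service has its own schema.

```python
import json
import string

# Hypothetical blocklist; a real one would be longer and user-configurable.
BLOCKLIST = {"darn", "heck"}

def find_bleep_intervals(transcript_json, blocklist=BLOCKLIST, pad=0.05):
    """Return merged (start, end) intervals, in seconds, covering blocklisted words.

    Assumes a transcript shaped like
    {"words": [{"word": "...", "start": 0.0, "end": 0.4}, ...]}.
    """
    words = json.loads(transcript_json)["words"]
    hits = []
    for w in words:
        # Normalize "Darn!" to "darn" before the lookup.
        token = w["word"].strip().strip(string.punctuation).lower()
        if token in blocklist:
            # Pad slightly so the edges of the word are covered too.
            hits.append((max(0.0, w["start"] - pad), w["end"] + pad))
    hits.sort()
    merged = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching intervals become one longer interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Exact-match lookup like this will miss misspelled or partially transcribed words, so a real blocklist check would likely need to be fuzzier.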

For "bleeping" the naughty words, we could take two different approaches. If the media player streaming the audio supports downloading the audio file and storing it locally, we could use an audio processing tool like FFmpeg to edit the file directly and replace the audio of the naughty words with a generated signal at the exact timestamps, effectively "bleeping" out the words.
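
A minimal sketch of the local-file approach, assuming the `ffmpeg` binary is on `PATH`. Note that this version silences the intervals using the `volume` filter's timeline `enable` option rather than inserting a generated tone; mixing in a tone would need a more involved filter graph. The function names `build_mute_command` and `mute_file` are hypothetical.

```python
import subprocess

def build_mute_command(src, dst, intervals):
    """Build an ffmpeg argv that silences each (start, end) interval, in seconds."""
    if not intervals:
        raise ValueError("no intervals to mute")
    # Timeline expression: non-zero whenever t falls inside any interval.
    expr = "+".join(f"between(t,{s:.3f},{e:.3f})" for s, e in intervals)
    # The single quotes protect the commas from ffmpeg's filtergraph parser.
    audio_filter = f"volume=enable='{expr}':volume=0"
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]

def mute_file(src, dst, intervals):
    # Runs ffmpeg without a shell; raises CalledProcessError if ffmpeg fails.
    subprocess.run(build_mute_command(src, dst, intervals), check=True)
```

For example, `mute_file("episode.mp3", "episode_clean.mp3", [(12.4, 12.9)])` would write a copy with roughly half a second silenced.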

I think media streamers like Spotify would take more work when it comes to piping audio into a speech-to-text service. For media players that don't support downloading the file locally (or that, like Spotify, encrypt local downloads), we can take a different approach. If the player has an accessible API for controlling playback volume (Spotify does), you could programmatically call that API to get the current playback volume, set the volume to zero for the duration of the swear word, and then restore the original value once the word has passed.
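
The get/zero/restore sequence described above could be sketched as follows. To keep it independent of any one service, the volume accessors are passed in as callables; `get_volume`, `set_volume`, and `mute_during` are hypothetical names, and for Spotify the two callables would presumably wrap the Web API's playback-state and set-volume endpoints.

```python
import time

def mute_during(get_volume, set_volume, duration_s, sleep=time.sleep):
    """Silence playback for duration_s seconds, then restore the previous volume.

    get_volume and set_volume are caller-supplied callables (hypothetical
    wrappers around whatever playback API is in use); volumes are 0-100.
    """
    original = get_volume()
    set_volume(0)
    try:
        sleep(duration_s)
    finally:
        # Restore even if the wait is interrupted, so playback is not left muted.
        set_volume(original)
    return original
```

One caveat with this design: each call to a remote API adds network latency, so very short words may be over by the time the volume change takes effect.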

Complexity and required time

Complexity

  • Beginner - This project requires no or little prior knowledge of the technolog(y|ies) specified to contribute to the project
  • Intermediate - The user should have some prior knowledge of the technolog(y|ies) to the point where they know how to use it, but not necessarily all the nooks and crannies of the technology
  • Advanced - The project requires the user to have a good understanding of all components of the project to contribute

Required time (ETA)

  • Little work - A couple of days
  • Medium work - A week or two
  • Much work - The project will take more than a couple of weeks and serious planning is required

I think the initial idea, with one media player and one speech-to-text service, wouldn't take a large amount of time. Ideally, though, I think this project would work best if the website/extension/web app supported multiple media players and multiple speech-to-text services, which would take considerably longer.

Categories

  • Mobile app
  • IoT
  • Web app
  • Frontend/UI
  • AI/ML
  • APIs/Backend
  • Voice Assistant
  • Developer Tooling
  • Extension/Plugin/Add-On
  • Design/UX
  • AR/VR
  • Bots
  • Security
  • Blockchain
  • Futuristic Tech/Something Unique
@KaKi87

KaKi87 commented May 10, 2024

Your XY problem is so similar to the previous OP's that I'm wondering if you're one and the same 😅

Say you install an app on your kids' devices that bleeps everything everywhere.

What's gonna happen when they listen to music on their friends' devices? Nothing.

This is an education problem that shouldn't be solved by technical means.

@FredrikAugust added the Much work, Intermediate, Mobile app, Web app, AI/ML, APIs/Backend, and Extension/Plugin/Add-on labels on May 13, 2024
@ppeters0502
Author

While my use case focused on my children (mostly listening to Spotify and podcasts in the car), I was thinking of this more as a niche (but still relevant) problem that doesn't seem to have an open-source solution. I could see use cases for listening with kids, for listening to music in a corporate setting, or just for people who don't like swear words (haha, like my mother-in-law!)
Especially since speech-to-text and AI-adjacent projects are picking up steam, I thought this sort of project could be a good starting point and (depending on interest and development) could grow into several different forms, like a browser extension, web app, or mobile app.

If there's no interest in this project, I can close the issue. I just felt like your response about kids' devices (which I totally understand, by the way!) focuses on only one specific use case, when I could see this project fitting multiple scenarios.

@a4v2d4

a4v2d4 commented Jul 27, 2024

@ppeters0502 I'm interested, I think your use case is actually pretty common.

For speech-to-text, whisper.cpp (https://github.com/ggerganov/whisper.cpp) could be used for free.

Not sure how we could intercept the audio stream from something like Spotify, even if we can mute the volume. I guess we could find the podcast from a separate source, and use the timestamps detected there for muting?
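
A small sketch of how precomputed timestamps might drive muting during streamed playback: given the intervals detected on a separately obtained copy and the player's reported position, work out how long to wait and how long to mute. The name `next_mute` is hypothetical, and the sketch assumes the streamed audio lines up exactly with the separate copy, which may not hold (for example, if ads are inserted dynamically).

```python
from bisect import bisect_right

def next_mute(intervals, position_s):
    """Return (wait_s, mute_s) for the next mute, or None if none remain.

    intervals: sorted, non-overlapping (start, end) pairs in seconds.
    position_s: the player's current playback position in seconds.
    """
    ends = [end for _, end in intervals]
    # Index of the first interval that has not finished yet.
    i = bisect_right(ends, position_s)
    if i == len(intervals):
        return None
    start, end = intervals[i]
    wait = max(0.0, start - position_s)
    # If playback is already inside the interval, mute only the remainder.
    return wait, end - max(start, position_s)
```

A caller would poll the player's position, wait `wait_s`, and then mute for `mute_s` before asking for the next interval.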


4 participants