Open Source auto-bleeper for podcasts and songs #381

ppeters0502 · 2024-05-09T14:21:52Z

Project description

Right now my kids are at the age where they start repeating anything they hear, including swear words. Since there are already tons of services for transcribing (relatively accurately) audio into text, I thought it might be useful to use various speech to text services and popular media player APIs to automatically "bleep" out swear words for a given audio stream or file.

Relevant Technology

There are tons of different Speech to Text services that (with a subscription or sometimes per-audio-file fee) can generate a transcript of an audio file and return a JSON file or TXT file with timestamps and the transcribed words.
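As a rough illustration of working with such a transcript, here is a minimal Python sketch that picks out the time ranges to bleep. The JSON shape (a list of objects with `word`, `start`, and `end` in seconds) and the tiny `BLOCKLIST` are assumptions made for the example; each real service has its own output format and a real project would load a maintained word list.

```python
import json
import string

# Hypothetical word list, for illustration only.
BLOCKLIST = {"damn", "hell"}

def find_bleep_intervals(transcript_json, blocklist=BLOCKLIST, padding=0.05):
    """Return (start, end) pairs in seconds for every blocklisted word.

    Assumes the transcript is a JSON list of
    {"word": str, "start": float, "end": float} entries.
    """
    words = json.loads(transcript_json)
    intervals = []
    for entry in words:
        # Normalise case and strip surrounding punctuation before matching.
        token = entry["word"].strip().strip(string.punctuation).lower()
        if token in blocklist:
            # Pad slightly on both sides, since word timestamps are rarely exact.
            start = max(0.0, entry["start"] - padding)
            end = entry["end"] + padding
            intervals.append((start, end))
    return intervals
```

The padding is a guess at how much slack the timestamps need; it would have to be tuned per service.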

For "bleeping" the naughty words, we could take two different approaches. If the media player streaming the audio supports downloading the audio file and storing it locally, we could use an audio processing tool like FFmpeg to edit the file directly and replace the audio of the naughty words with a generated tone at the exact timestamps, effectively "bleeping" out the words.
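A minimal sketch of the local-file approach, assuming FFmpeg is installed and the intervals are already known. To keep it short it silences each interval with FFmpeg's `volume` filter and its `enable='between(t,start,end)'` timeline option rather than mixing in a tone, so it is a simplification of a true bleep. The function name and file names are made up for the example.

```python
def build_mute_command(src, dst, intervals):
    """Build an ffmpeg argument list that silences each (start, end) interval."""
    # One volume filter per interval; each is only active inside its time range.
    filters = [
        f"volume=enable='between(t,{start:.3f},{end:.3f})':volume=0"
        for start, end in intervals
    ]
    cmd = ["ffmpeg", "-i", src]
    if filters:
        # Chain the filters with commas into a single audio filtergraph.
        cmd += ["-af", ",".join(filters)]
    cmd.append(dst)
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Building the list without a shell avoids having to escape the quotes inside the filter string.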

Piping audio into a Speech to Text service would probably take more work for media streamers like Spotify. But for media players that don't support downloading the file locally (or that, like Spotify, encrypt local downloads), we can take a different approach. If they have an accessible API for controlling playback volume (for this example, Spotify does), you could programmatically call the media streamer's API to get the current playback volume, set the volume to zero for the duration of the swear word, and then restore the original volume once the word has passed.
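The volume approach could be sketched as a schedule of "set volume" events derived from the swear-word intervals. The function below only plans the events; actually sending them (for Spotify, presumably via the Web API's set-playback-volume endpoint with its `volume_percent` parameter) is left out, and `plan_volume_events` is a hypothetical name for this example.

```python
def plan_volume_events(intervals, original_volume):
    """Turn (start, end) intervals into a sorted list of (time, volume_percent) events."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlapping or touching intervals become one muted stretch.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    events = []
    for start, end in merged:
        events.append((start, 0))               # mute when the word starts
        events.append((end, original_volume))   # restore once it has passed
    return events
```

A player loop would compare the current playback position against the next event and issue the volume call when it is reached. Since every call is a network round trip, the mute would likely need to fire a little early, which is an assumption to verify against real latency.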

Complexity and required time

Complexity

Beginner - This project requires no or little prior knowledge of the technolog(y|ies) specified to contribute to the project
Intermediate - The user should have some prior knowledge of the technolog(y|ies) to the point where they know how to use it, but not necessarily all the nooks and crannies of the technology
Advanced - The project requires the user to have a good understanding of all components of the project to contribute

Required time (ETA)

Little work - A couple of days
Medium work - A week or two
Much work - The project will take more than a couple of weeks and serious planning is required

I think the initial idea with one media player and one speech to text service wouldn't take a large amount of time. Ideally I think this project would work best if the website/extension/web app supported multiple types of media players and multiple speech to text services, which would take a considerable amount of time.

Categories

KaKi87 · 2024-05-10T12:00:29Z

Your XY problem is so similar to the previous OP's that I'm wondering if you're one and the same 😅

Say you install an app on your kids' devices that bleeps everything everywhere.

What's gonna happen when they listen to music with friends on their friends' devices? Nothing.

This is an education problem, one that shouldn't be solved by technical means.

ppeters0502 · 2024-05-14T14:18:35Z

While my use case focused on my children (mostly listening to Spotify and podcasts in the car), I was thinking of this more as a niche (but still relevant) issue that doesn't seem to have an open-source option. I could potentially see use cases for listening with kids, for listening to music in a corporate setting, or just for people who don't like swear words (haha, like my mother-in-law!)
Especially since Speech to Text and AI-adjacent projects are picking up steam, I thought this sort of project could be a good starting point, and (depending on interest and development) could branch out into several different forms, like a browser extension, web app, or mobile app.

If there's no interest in this project I can close the issue, I just felt like your response on kids' devices (which I totally understand by the way!) is only focusing on one specific use case, when I could see this project possibly fitting multiple scenarios.

a4v2d4 · 2024-07-27T04:53:18Z

@ppeters0502 I'm interested, I think your use case is actually pretty common.

For speech-to-text, whisper.cpp (https://github.com/ggerganov/whisper.cpp) could be used for free.

Not sure how we could intercept the audio stream from something like Spotify, even if we can mute the volume. I guess we could find the podcast from a separate source, and use the same detected timestamps for muting?
