REST API (with frontend) to remove ad segments from audio/video files using Azure-hosted LLMs and speech-to-text (STT).
A transcript is made with Azure AI Speech, which returns the full transcription along with word-level timestamps. The full transcription is then sent to an LLM from Azure AI Foundry, which extracts the complete advertisement segments. The `start_time` and `end_time` of each segment are then used with Azure Media Services to remove those segments from the original audio file, and the cleaned audio file is returned to the user.
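The removal step amounts to keeping everything that is not inside an ad segment. The sketch below is not taken from this repository: it is a minimal illustration, assuming each segment is a `(start_time, end_time)` pair in seconds and that the total duration of the file is known. The function names `merge_segments` and `keep_intervals` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a segment is (start_time, end_time) in seconds.
Segment = Tuple[float, float]


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Sort ad segments and merge any that overlap or touch."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if end <= start:
            continue  # skip empty or inverted segments
        if merged and start <= merged[-1][1]:
            # Overlaps the previous segment: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def keep_intervals(ad_segments: List[Segment], duration: float) -> List[Segment]:
    """Return the parts of [0, duration] that lie outside every ad segment."""
    keep: List[Segment] = []
    cursor = 0.0  # end of the last removed region seen so far
    for start, end in merge_segments(ad_segments):
        # Clamp to the file's bounds in case a timestamp runs past the end.
        start = max(0.0, min(start, duration))
        end = max(0.0, min(end, duration))
        if start > cursor:
            keep.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        keep.append((cursor, duration))
    return keep


if __name__ == "__main__":
    ads = [(30.0, 60.0), (10.0, 20.0), (55.0, 70.0)]
    print(keep_intervals(ads, 100.0))
```

The resulting intervals are what a cutting tool would concatenate to produce the cleaned file. Merging first means overlapping segments reported by the LLM do not produce negative-length or duplicated cuts.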
- Clone the repository

      git clone https://github.com/nocdn/ad-segment-trimmer-azure.git

- Copy the `.env.example` file to `.env` and fill in the required values

      cp .env.example .env

- To run the backend, navigate to the `backend` directory and run the following commands:

      cd backend
      python3 -m venv venv
      source venv/bin/activate
      pip install -r requirements.txt
      python3 api.py

  (the API should now be running on localhost:7070)
To use the API you can use the following curl command:

    curl -F "file=@audio.mp3" -OJ http://localhost:7070/process

(the `-OJ` flags save the file with the name returned by the API, in this case, with an `_edited` suffix)
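For callers who prefer Python to curl, the same request can be sketched with the standard library only. This is an illustration, not part of the repository. It assumes the endpoint accepts a multipart form field named `file` (as in the curl command) and returns the output name in a `Content-Disposition` header. The helper names and the fallback file name are hypothetical.

```python
import uuid
from email.message import Message
from pathlib import Path
from typing import Optional, Tuple
from urllib.request import Request, urlopen

API_URL = "http://localhost:7070/process"  # endpoint from the curl command


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, Content-Type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def filename_from_header(content_disposition: Optional[str], fallback: str) -> str:
    """Extract the filename parameter, roughly what curl's -J does."""
    if not content_disposition:
        return fallback
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    # Keep only the final path component so the server cannot pick a directory.
    return Path(msg.get_filename() or fallback).name


def process_file(path: str, url: str = API_URL) -> Path:
    """Upload a file and save the response next to the current directory."""
    src = Path(path)
    body, content_type = build_multipart("file", src.name, src.read_bytes())
    req = Request(url, data=body, headers={"Content-Type": content_type}, method="POST")
    with urlopen(req) as resp:
        # Assumption: fall back to an _edited suffix if no name is returned.
        fallback = f"{src.stem}_edited{src.suffix}"
        out = Path(filename_from_header(resp.headers.get("Content-Disposition"), fallback))
        out.write_bytes(resp.read())
    return out
```

With the backend running, `process_file("audio.mp3")` would upload the file and return the path of the saved result.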
- To run the frontend, navigate to the `frontend` directory and run the following commands:

      cd frontend
      npm install
      npm run dev --open

This project is licensed under the MIT License - see the LICENSE file for details.