Audioflare

An all-in-one AI audio playground using Cloudflare AI Workers to transcribe, analyze, summarize, and translate any audio file.

View Demo · Report Bug · Request Feature

Table of Contents

About This Project
Demo
Key Features
Built With
Getting Started
Contributing
License
Contact

About This Project

Audioflare began as a side project at Smol AI to explore what Cloudflare AI workers can do. It chains several AI workers together to process an audio file of up to 30 seconds. Here’s a walkthrough of the core functionality:

Transcription:
- First, the audio file is transcribed using Cloudflare's Speech to Text worker, which is built on OpenAI's Whisper model.
Summarization:
- The transcribed text is then summarized using Cloudflare's LLM AI worker, based on Meta's llama-2-7b-chat-int8 model. Note that this model struggles with lengthy prompts.
Sentiment Analysis:
- Sentiment analysis is performed on the transcribed text using Cloudflare's Text Classification AI worker, which uses Hugging Face’s distilbert-sst-2-int8 model.
Translation:
- The transcribed text is translated into nine languages using Cloudflare's Translation AI workers, which utilize Meta's m2m100-1.2b model.
Performance Metrics:
- The time taken to process each request is measured and displayed.
Observability and Monitoring:
- The Cloudflare AI Gateway is used to add observability and monitoring to the AI workers, including analytics, logging, caching, and rate limiting.
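The walkthrough above can be sketched as a small orchestration function. This is only an illustration, not the repository's actual code: `RunModel`, `runPipeline`, and `PipelineResult` are hypothetical names, the model IDs are the ones listed in Cloudflare's Workers AI catalog (assumed to correspond to the workers named above), the input and output shapes follow Cloudflare's documented schemas as I understand them, and the source language is assumed to be English. Running the last three steps concurrently is a design choice of this sketch, since each depends only on the transcript.

```typescript
// Model IDs from Cloudflare's Workers AI catalog; that they match the
// workers named above is an assumption.
const MODELS = {
  transcribe: "@cf/openai/whisper",
  summarize: "@cf/meta/llama-2-7b-chat-int8",
  sentiment: "@cf/huggingface/distilbert-sst-2-int8",
  translate: "@cf/meta/m2m100-1.2b",
} as const;

// Hypothetical transport: runs one model and resolves with its raw result.
type RunModel = (model: string, input: unknown) => Promise<unknown>;

interface SentimentScore {
  label: string;
  score: number;
}

interface PipelineResult {
  transcript: string;
  summary: string;
  sentiment: SentimentScore[];
  translations: Record<string, string>;
}

async function runPipeline(
  run: RunModel,
  audio: Uint8Array,
  targetLangs: string[],
): Promise<PipelineResult> {
  // Step 1: every later step needs the transcript, so this runs first.
  const { text: transcript } = (await run(MODELS.transcribe, audio)) as { text: string };

  // Steps 2-4 depend only on the transcript, so they can run concurrently.
  const [summaryOut, sentimentOut, translatedOut] = await Promise.all([
    run(MODELS.summarize, { prompt: `Summarize the following text:\n${transcript}` }),
    run(MODELS.sentiment, { text: transcript }),
    Promise.all(
      targetLangs.map((lang) =>
        // source_lang is assumed to be English in this sketch.
        run(MODELS.translate, { text: transcript, source_lang: "english", target_lang: lang }),
      ),
    ),
  ]);

  // Map each requested language to its translated text, preserving order.
  const translations: Record<string, string> = {};
  targetLangs.forEach((lang, i) => {
    translations[lang] = (translatedOut[i] as { translated_text: string }).translated_text;
  });

  return {
    transcript,
    summary: (summaryOut as { response: string }).response,
    sentiment: sentimentOut as SentimentScore[],
    translations,
  };
}
```

Passing the transport in as `run` keeps the sketch independent of how requests are actually sent (REST, AI Gateway, or a Worker binding) and makes it easy to exercise with a stub.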

The current setup has limitations: transcription is capped at 30 seconds of audio, and the LLM's summarization quality could be better.

Audioflare shows the potential of Cloudflare AI workers: a standardized AI API request format makes multi-step AI tasks simpler to build. Although the models in use have limitations and are marked as 'beta' by Cloudflare, there's a clear path toward improving this project as more models become available.
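To illustrate the standardized request format, here is a minimal sketch of how one helper could prepare a request for any of the models. It assumes Cloudflare's REST endpoint for Workers AI and the AI Gateway URL scheme for the Workers AI provider as documented by Cloudflare; `CloudflareConfig`, `PreparedRequest`, `buildRunUrl`, `prepareRequest`, and the `gatewaySlug` field are hypothetical names, and the repository may build its requests differently.

```typescript
interface CloudflareConfig {
  accountId: string;
  apiToken: string;
  gatewaySlug?: string; // hypothetical: name of an AI Gateway, if one is used
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string | Uint8Array;
}

// Same URL shape for every model; only the model ID at the end changes.
function buildRunUrl(cfg: CloudflareConfig, model: string): string {
  return cfg.gatewaySlug
    ? `https://gateway.ai.cloudflare.com/v1/${cfg.accountId}/${cfg.gatewaySlug}/workers-ai/${model}`
    : `https://api.cloudflare.com/client/v4/accounts/${cfg.accountId}/ai/run/${model}`;
}

function prepareRequest(
  cfg: CloudflareConfig,
  model: string,
  input: Record<string, unknown> | Uint8Array,
): PreparedRequest {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${cfg.apiToken}`,
  };
  // Text models take a JSON body; audio is sent as raw bytes.
  if (!(input instanceof Uint8Array)) {
    headers["Content-Type"] = "application/json";
  }
  return {
    url: buildRunUrl(cfg, model),
    method: "POST",
    headers,
    body: input instanceof Uint8Array ? input : JSON.stringify(input),
  };
}
```

The prepared object can be handed to `fetch(request.url, request)`. Switching between direct calls and the gateway only changes the URL, which is what makes chaining several models straightforward.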

Contributions are welcome: feel free to open issues and submit pull requests as you experiment with Audioflare. The project is meant as a template for learning and working with Cloudflare AI workers. It doesn’t currently include Cloudflare's Image Classification or Text Embedding workers because they aren't relevant to the audio use case, but it’s a step toward understanding and using the Cloudflare AI ecosystem.

As Cloudflare broadens its model support, I look forward to refining Audioflare, making it a more robust and informative template for the developer community.

Demo

Key Features

Audio Processing:
- Users can upload an audio file for processing.
  - Drag and drop a local audio file from their computer.
  - Alternatively, drag and drop one of three pre-provided audio files included on the main page and in this repo.
- Audio files longer than 30 seconds are supported, but only the first 30 seconds will be transcribed.
- Audio transcription is handled by Cloudflare's Speech to Text worker (based on OpenAI's Whisper model).
Text Summarization:
- Transcribed text is summarized using Cloudflare's LLM AI worker (based on Meta's llama-2-7b-chat-int8 model).
Sentiment Analysis:
- Sentiment analysis is performed on the transcribed text using Cloudflare's Text Classification AI worker (based on Hugging Face’s distilbert-sst-2-int8 model).
Translation:
- Transcribed text is translated into nine different languages using Cloudflare's Translation AI workers (based on Meta's m2m100-1.2b model).
Performance Metrics:
- Time taken for each request to be processed is calculated and displayed.
Observability and Monitoring:
- Uses Cloudflare AI Gateway to add observability and monitoring to the AI workers:
  - Analytics: View metrics like the number of requests and tokens.
  - Logging: Monitor requests and errors.
  - Caching: Serve requests from Cloudflare’s cache for faster response and cost savings.
  - Rate Limiting: Control application scaling by limiting the number of received requests.
Learning and Exploration:
- Audioflare serves as a template for learning and working with Cloudflare AI workers.
- Users can explore the Cloudflare AI workers listed above. The Image Classification and Text Embedding workers are not integrated because they are not relevant to the audio use case.
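As an illustration of the Performance Metrics feature, the time taken by each request could be captured with a small wrapper like the one below. This is a sketch under assumptions: `timed` and `Timed` are hypothetical names, and the repository may measure timings differently.

```typescript
interface Timed<T> {
  result: T;
  ms: number; // elapsed wall-clock time in milliseconds
}

// Runs an async task and reports how long it took alongside its result.
async function timed<T>(task: () => Promise<T>): Promise<Timed<T>> {
  const start = Date.now();
  const result = await task();
  return { result, ms: Date.now() - start };
}
```

Each AI request can then be wrapped, for example `timed(() => fetch(url, init))`, so the UI can show the duration next to the output of that step.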

Built With

This project was built in 2023 using the following technologies.

See package.json for a full list of dependencies.

Getting Started

To get a local copy up and running, follow these steps.

Clone this repository

git clone https://github.com/seanoliver/audioflare.git

Install dependencies
```
cd audioflare
bun install
```
Create a Cloudflare account
Install Wrangler and log in
```
bun add wrangler --dev
bunx wrangler login
```
Rename .env.example to .env and follow the instructions linked in the comments to find each of the required keys and values.
Run the app
```
bun dev
```
Go to http://localhost:3000 to check it out
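Because the app depends on the values in .env, it can help to fail fast when one is missing. The sketch below shows one way to do that. The variable names `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_API_TOKEN` are hypothetical placeholders, as are `loadEnv` and `AppEnv`; the real keys are the ones listed in .env.example.

```typescript
// Hypothetical variable names: the real ones are defined in .env.example.
const REQUIRED_KEYS = ["CLOUDFLARE_ACCOUNT_ID", "CLOUDFLARE_API_TOKEN"] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type AppEnv = Record<RequiredKey, string>;

// Returns the required values, or throws an error naming every missing key.
function loadEnv(source: Record<string, string | undefined>): AppEnv {
  const missing = REQUIRED_KEYS.filter((key) => !source[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const env = {} as AppEnv;
  for (const key of REQUIRED_KEYS) {
    env[key] = source[key] as string;
  }
  return env;
}
```

In server-side code this could be called once as `loadEnv(process.env)`, so a misconfigured .env surfaces at startup rather than on the first AI request.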

Contributing

This is a great project for learning Cloudflare, AI Workers, and simple Next.js API Routes. Feel free to fork this repo and make it your own. If you have any questions or suggestions, please feel free to contact me!

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Sean Oliver - @SeanOliver - helloseanoliver@gmail.com

Project Link: https://github.com/seanoliver/audioflare

Live Demo: https://audioflare.seanoliver.dev/
