A browser-based web application that transcribes audio and then processes the text with AI: proofreading, meeting minutes, summaries, and outlines. It turns spoken content into polished documents with nothing to install.
Live Demo - No installation required! You will need your own AI API keys.
https://jarchitect.org/whisper
- Audio Transcription - Convert speech to text using OpenAI's Whisper technology, with high accuracy and multi-language support
- AI Proofreading - Automatically correct grammar, spelling, and punctuation errors while preserving your original meaning and tone
- Meeting Minutes - Transform raw conversation transcripts into professional, structured meeting minutes with action items and key decisions
- Smart Summarization - Extract essential points and create concise summaries from lengthy transcripts and documents
- Outline Generation - Create well-structured outlines and organize unstructured content into logical, hierarchical formats
This application is built on the following technologies:
- OpenAI Whisper - Advanced speech recognition
- OpenAI Compatible Models - Flexible AI text processing
- localStorage - Client-side storage for settings and API keys
- Browser APIs - Native web technologies
Audio and text are sent from your browser directly to the AI APIs you configure. Your data is never stored on our servers.
To use this application, you'll need API keys from the following services:
- Fireworks AI - For audio transcription
- OpenRouter - For text processing
1. Create Accounts
   - Sign up for Fireworks AI
   - Sign up for OpenRouter
2. Purchase Credits
   - Add a minimum of $5.00 to each service
3. Generate API Keys
   - Create API keys from each service's dashboard
   - Keep these keys secure and never share them
4. Configure Application
   - Open the application in your browser
   - Navigate to settings
   - Enter your API keys
Fireworks AI Whisper V3 Turbo (Recommended)
- Cost Effective: $0.0009 per minute (standard) or $0.00126 per minute (with speaker diarization)
- Privacy First: Zero data retention policy by default
- High Accuracy: State-of-the-art speech recognition technology
- Language Support: Multiple languages and dialects
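To make the per-minute rates concrete, here is a minimal sketch of a cost estimate. The helper name `estimateTranscriptionCost` and its options are hypothetical and not part of the application; the rates are the ones quoted above and may change.

```javascript
// Per-minute rates quoted in this README (USD). Check current pricing before relying on them.
const RATE_STANDARD = 0.0009;
const RATE_DIARIZED = 0.00126;

// Hypothetical helper (not part of the app): estimated cost of transcribing
// `minutes` of audio, optionally with speaker diarization.
function estimateTranscriptionCost(minutes, { diarization = false } = {}) {
  if (!Number.isFinite(minutes) || minutes < 0) {
    throw new RangeError("minutes must be a non-negative number");
  }
  const rate = diarization ? RATE_DIARIZED : RATE_STANDARD;
  return minutes * rate;
}

// A one-hour recording:
console.log(estimateTranscriptionCost(60).toFixed(4)); // prints "0.0540"
console.log(estimateTranscriptionCost(60, { diarization: true }).toFixed(4)); // prints "0.0756"
```

At the standard rate, and assuming the rate is unchanged, a $5.00 credit covers roughly 5,555 minutes of audio (about 92 hours).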
OpenRouter Models (Recommended)
- Privacy Protected: No storage of prompts or responses
- Model Variety: Access to multiple AI models
- Competitive Pricing: Transparent, usage-based pricing
Note: While any OpenAI-compatible model can be used, be aware that many free AI services use your data for model training. The recommended services prioritize data privacy.
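As an illustration of what an OpenAI-compatible text-processing call can look like, the sketch below builds a proofreading request for OpenRouter's chat completions endpoint. The function names, the system prompt, and the choice to pass the model id as a parameter are assumptions made for this example; the application's actual code may be structured differently.

```javascript
// OpenRouter's OpenAI-compatible chat completions endpoint.
const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

// Hypothetical helper: builds the fetch() arguments for a proofreading request.
// `model` is whatever model id you have chosen in your OpenRouter account.
function buildProofreadRequest(apiKey, model, transcript) {
  return {
    url: OPENROUTER_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [
          {
            role: "system",
            // Illustrative prompt, not the app's real prompt.
            content:
              "Correct grammar, spelling, and punctuation. Preserve the original meaning and tone.",
          },
          { role: "user", content: transcript },
        ],
      }),
    },
  };
}

// Hypothetical helper: sends the request and returns the corrected text.
// `fetchImpl` defaults to the global fetch and can be swapped out for testing.
async function proofread(apiKey, model, transcript, fetchImpl = fetch) {
  const { url, options } = buildProofreadRequest(apiKey, model, transcript);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

Separating the request builder from the network call keeps the payload easy to inspect and makes it simple to point the same code at another OpenAI-compatible service by changing the URL.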
Your audio files and transcripts are processed directly in your browser or sent securely to AI APIs. No data passes through or is stored on our servers.
Your API keys are stored locally in your browser's localStorage and are only used for direct API requests. We never have access to your keys.
This application uses localStorage to remember your preferences and settings. No tracking cookies or analytics are used.
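For readers unfamiliar with localStorage, the sketch below shows one way settings such as API keys could be saved and read back entirely on the client. The storage key `transcriber.settings`, the function names, and the shape of the settings object are assumptions for illustration, not the application's actual implementation.

```javascript
// Hypothetical storage key; the app's real key names may differ.
const SETTINGS_KEY = "transcriber.settings";

// Save a settings object as JSON. `storage` defaults to the browser's
// localStorage and can be replaced with any object exposing getItem/setItem.
function saveSettings(settings, storage = globalThis.localStorage) {
  storage.setItem(SETTINGS_KEY, JSON.stringify(settings));
}

// Load the settings object, falling back to {} when nothing is stored
// or the stored value is not valid JSON.
function loadSettings(storage = globalThis.localStorage) {
  const raw = storage.getItem(SETTINGS_KEY);
  if (raw === null) return {};
  try {
    return JSON.parse(raw);
  } catch {
    return {};
  }
}

// In a browser:
//   saveSettings({ openRouterKey: "..." });
//   const { openRouterKey } = loadSettings();
```

Values in localStorage stay in the browser profile and are not sent anywhere automatically, but they are readable by any script running on the same origin, so keys stored this way are only as safe as the page that loads them.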
The complete source code is available for review in this repository. This ensures full transparency about how your data is handled and processed.
1. Clone the repository
   git clone https://github.com/jarchitect1/simple_transcriber.git
   cd simple_transcriber
2. Open in browser
   - Simply open index.html in your web browser
3. Configure API Keys
   - Open the application
   - Go to settings
   - Enter your API keys
simple_transcriber/
├── index.html      # Main application page
├── settings.html   # Settings page
├── about.html      # About page
└── README.md       # This file
- Journalists: Transcribe interviews and create polished articles
- Students: Process lecture recordings into organized notes
- Business Professionals: Generate meeting minutes and summaries
- Content Creators: Transform audio content into written materials
- Researchers: Analyze recorded interviews and discussions
- Accessibility: Convert audio content for hearing-impaired users
We welcome contributions!
- Yet to think of any.
Found a bug or have an idea for improvement? Please:
- Check existing issues to avoid duplicates
- Create a new issue with detailed information
- Use appropriate labels (bug, enhancement, question)
- Provide steps to reproduce for bugs
- Include system information (browser, OS, etc.)
Need help or have questions?
- Email: smartygab24@gmail.com
- GitHub Issues: Use the Issues tab above
- Response Time: I typically respond within 24 hours
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for the Whisper technology
- Fireworks AI for providing cost-effective transcription services
- OpenRouter for privacy-focused AI model access
- Contributors who help improve this project
- Mainly with DeepSeek R1, Gemini 2.5 Pro & Claude Sonnet 4
Version: 1.0.0
Last Updated: July 2025
Star this repository if you find it helpful!
Share with others who might benefit from this tool!