Transform documents into natural-sounding audio summaries using Azure OpenAI's GPT-4o and GPT-4o-audio-preview models.
- Support for PDF, TXT, DOC, and DOCX files
- Multiple processing methods (Vision Only or Text + Vision)
- Customizable processing goals
- Adjustable summary lengths
- Multiple language support with flags
- Various voice options and tones
- Modern, responsive UI with intuitive controls
- PostgreSQL database for reliable history storage
- Rerun capability with modified settings
- Easy deletion of history entries
- Mobile-friendly design with glassmorphism effects
The application now includes a robust history management system:
- View all processed documents with timestamps
- See processing settings for each entry
- Rerun previous entries with new settings
- Delete individual entries with confirmation
- Clear visual feedback for selected entries
- Automatic audio player updates when selecting entries
- Docker and Docker Compose (recommended)
- Or for local development:
  - Python 3.8 or higher
  - Poppler (required for PDF processing)
  - PostgreSQL 15 or higher
- Azure OpenAI API access
- Azure OpenAI API key and endpoint
- Clone the repository:

      git clone <repository-url>
      cd <repository-directory>
- Set up environment variables: Create a file named keys.env in the project root:

      AZURE_OPENAI_ENDPOINT=your_endpoint_here
      AZURE_OPENAI_API_KEY=your_api_key_here
- Configure model settings: If needed, edit config.py in the project root with the names of your Azure OpenAI model deployments:

      # Model Deployments
      AZURE_MODELS = {
          'text': os.getenv('AZURE_OPENAI_TEXT_DEPLOYMENT', 'gpt-4o'),
          'audio': os.getenv('AZURE_OPENAI_AUDIO_DEPLOYMENT', 'gpt-4o-audio-preview'),
      }
- Start the application:

      # Default setup (port 5001)
      docker-compose up -d

      # If port 5001 is in use, you can override it:
      PORT=8080 docker-compose up -d
- Access the application:
  - Default: http://localhost:5001
  - Or if you changed the port: http://localhost:YOUR_PORT
- View logs:

      docker-compose logs -f
- Stop the application:

      docker-compose down     # Keeps the database data
      docker-compose down -v  # Removes all data (fresh start)
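The deployment-name lookup from the config.py step above can be tried in isolation. The following is a minimal sketch, not the project's actual code: `resolve_deployments` is a hypothetical helper that reproduces the same environment-variable fallback, so you can see which deployment names would be used for a given environment.

```python
import os


def resolve_deployments(env=None):
    """Return deployment names, preferring environment overrides over defaults.

    Hypothetical helper mirroring the AZURE_MODELS lookup shown in config.py.
    """
    env = os.environ if env is None else env
    return {
        "text": env.get("AZURE_OPENAI_TEXT_DEPLOYMENT", "gpt-4o"),
        "audio": env.get("AZURE_OPENAI_AUDIO_DEPLOYMENT", "gpt-4o-audio-preview"),
    }


if __name__ == "__main__":
    # Show the names that the current shell environment would select.
    for kind, name in resolve_deployments().items():
        print(f"{kind}: {name}")
```

Note that a variable set to an empty string still counts as set, so the default is only used when the variable is absent.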
- Install Poppler:
  - macOS:

        brew install poppler

  - Ubuntu/Debian:

        sudo apt-get install poppler-utils

  - Windows:
    - Download from poppler-windows
    - Extract and add the bin directory to PATH
- Install PostgreSQL:
  - PostgreSQL Downloads
  - Create a database named aoai_audio
- Create and activate virtual environment:

      python -m venv venv

      # Windows
      .\venv\Scripts\activate

      # macOS/Linux
      source venv/bin/activate
- Install dependencies:

      pip install -r requirements.txt
- Configure environment: Create keys.env with your Azure OpenAI credentials and database URL:

      AZURE_OPENAI_ENDPOINT=your_endpoint_here
      AZURE_OPENAI_API_KEY=your_api_key_here
      DATABASE_URL=postgresql://postgres:postgres@localhost:5432/aoai_audio
- Run the application:

      python -m flask run --host=0.0.0.0 --port=5001
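Before a local run, it can help to confirm that keys.env defines the three variables from the configuration step above. This is a standalone sketch using only the standard library; it is not how the application itself loads the file, and `parse_env_text` and `missing_keys` are hypothetical helper names.

```python
from pathlib import Path

# Variable names taken from the keys.env example for local development.
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY", "DATABASE_URL")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED):
    """Return the required names that are absent or empty."""
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    path = Path("keys.env")
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    missing = missing_keys(parse_env_text(text))
    if missing:
        print("Missing in keys.env: " + ", ".join(missing))
    else:
        print("keys.env defines all required variables")
```

The check only verifies that the names are present; it does not validate the endpoint, the key, or the database credentials.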
If port 5001 is already in use:
- Docker setup:

      # Use a different port (e.g., 8080)
      PORT=8080 docker-compose up -d

- Local setup:

      # Use a different port (e.g., 8080)
      python -m flask run --host=0.0.0.0 --port=8080
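To find out whether port 5001 is actually taken before choosing another one, a small standard-library check is enough. The sketch below is an illustration only; `is_port_free` and `first_free_port` are hypothetical helpers and not part of the project.

```python
import socket


def is_port_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
        return True


def first_free_port(start=5001, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    return None


if __name__ == "__main__":
    port = first_free_port()
    if port is None:
        print("No free port found in the scanned range")
    else:
        print(f"Try: python -m flask run --host=0.0.0.0 --port={port}")
```

The result is only a snapshot: another process can claim the port between the check and the moment the server starts.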
- Ensure PostgreSQL is running
- Check the connection string in keys.env
- For Docker: run docker-compose logs db to check the database logs
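The first two checks above can be partly automated. The following sketch, which assumes the DATABASE_URL format shown in the keys.env example, parses the connection string and tests whether anything is listening on the host and port. `parse_database_url` and `can_connect` are hypothetical helpers; a successful TCP connection does not prove that the credentials or the database name are correct.

```python
import os
import socket
from urllib.parse import urlsplit

# Default taken from the keys.env example for local development.
DEFAULT_URL = "postgresql://postgres:postgres@localhost:5432/aoai_audio"


def parse_database_url(url):
    """Split a postgresql:// URL into the parts worth double-checking."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname or "localhost",
        "port": parts.port or 5432,  # 5432 is PostgreSQL's default port
        "user": parts.username,
        "database": parts.path.lstrip("/"),
    }


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    info = parse_database_url(os.environ.get("DATABASE_URL", DEFAULT_URL))
    print(f"host={info['host']} port={info['port']} "
          f"user={info['user']} database={info['database']}")
    reachable = can_connect(info["host"], info["port"])
    print("PostgreSQL port is reachable" if reachable else "Nothing is listening there")
```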
- Environment Variables:
  - Never commit keys.env (it's in .gitignore)
  - Use strong passwords in production
- Data Persistence:
  - Database data persists in the Docker volume postgres_data
  - Back up regularly in production
- Security:
  - Change the default PostgreSQL password in production
  - Use SSL in production
  - Keep dependencies updated
- Monitoring:
  - Check application logs:

        docker-compose logs -f web

  - Check database logs:

        docker-compose logs -f db