Release v0.0.7
First stable release + publish to PyPI
What's Changed
see full changelog
- feat(module): initial module by @mratanusarkar in #1
- feat(exp): build py script to gen .srt -> .mp4 by @mratanusarkar in #2
- feat(sub2pod): convert exp script into subtitle text to podcast module by @mratanusarkar in #3
- feat(docs): api documentation for sub2pod + example usage by @mratanusarkar in #4
- feat(sub2pod): optimize and parallelize video generation engine by @mratanusarkar in #5
- feat(docs): demo example usage tutorial with real world usecase by @mratanusarkar in #6
- feat(docu): minor update readme and docs by @mratanusarkar in #8
- feat(docu): add github ci for docs by @mratanusarkar in #9
- feat(sub2pod): implement effects submodule + re-factor audim API design by @mratanusarkar in #11
- chore(sub2pod): fix docs, lints and formats by @mratanusarkar in #12
- feat(exp): build py script to gen .mp3 -> .srt + playback .mp3 + .srt in RT by @mratanusarkar in #13
- feat(sub2pod): srt timestamp normalization + content positioning offset by @mratanusarkar in #14
- feat(aud2sub): convert exp script into audio to subtitle text module by @mratanusarkar in #15
- feat(util): add utility module to playback audio with srt + replace speaker placeholders with names by @mratanusarkar in #17
- feat(docs): api documentation for aud2sub & utils modules + example usage by @mratanusarkar in #19
- feat(util): add utility module to extract audio from video by @mratanusarkar in #24
- feat(sub2pod): add watermark element by @mratanusarkar in #25
- feat(docs): organize docs, introduce devblog, add pre-release blog by @mratanusarkar in #26
- feat(docs): add usage and examples covering various scenarios by @mratanusarkar in #27
- feat(license): add license, attribution & citation by @mratanusarkar in #28
- feat(docs): fix: asset links + add: demo and disclaimer sections by @mratanusarkar in #29
- feat(docs): publish to pypi + improve docs by @mratanusarkar in #30
- fix(docs): broken links in readme by @mratanusarkar in #31
Audim - First Stable Release 🎉
see release notes (short)
Audim (v0.0.7) - First Stable Release
Transform audio podcasts into engaging animated videos with precise programmatic control.
What's New
🎙️ Audio to Video Pipeline
- Automatic Transcription: Convert audio/video files to timestamped subtitles
- Speaker Diarization: Identify and label multiple speakers
- Animated Video Generation: Transform subtitles + audio into professional podcast videos
🎬 Key Features
- Multi-format Support: MP3, M4A, WAV, MP4, MKV, AVI
- Layout System: Customizable scenes with headers, profiles, and text elements
- Effects Engine: Smooth transitions and highlight effects
- Parallel Processing: Optimized rendering for faster video generation
- Watermark System: Built-in branding and attribution
Installation
pip install audim
Documentation: https://mratanusarkar.github.io/audim/
License
Apache 2.0 - Free for personal and commercial use. Please retain the default watermark or add attribution.
Contributing
We welcome contributions! Please check our development guide to get started.
Note: This is our first stable release. While extensively tested, please report any issues you encounter. Your feedback helps us improve!
see full release notes (long)
Audim (v0.0.7) - Audio Podcast Animation Engine
We are excited to announce the first stable release of Audim, an animation and video rendering engine designed for creating visually engaging podcast videos from audio-based and voice-based content. This release is the culmination of development work across multiple iterations and a significant milestone in programmatic podcast video generation.
🎯 Overview
Audim provides precise programmatic animation and video rendering for audio-based content. It lets creators convert raw audio recordings into animated podcast videos with layout-based scenes, automated subtitle generation, and customizable visual elements.
✨ Core Features
Audio Processing and Transcription
- Audio to Subtitle Generation: Complete audio transcription engine with support for multiple audio formats (.mp3, .m4a, .wav)
- Real-time Processing: Generate subtitles and transcripts from audio/video files with timestamp synchronization
- Speaker Recognition: Advanced speaker identification and placeholder replacement with actual names
- Multi-format Support: Compatible with various video formats (.mp4, .mkv, .avi) for audio extraction
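As an illustration of the speaker placeholder replacement listed above, here is a minimal sketch in plain Python. It is not Audim's API: the `[Speaker N]` placeholder format, the function name `replace_speakers`, and the sample subtitle text are all assumptions made for this example.

```python
import re


def replace_speakers(srt_text: str, names: dict[int, str]) -> str:
    """Replace "[Speaker N]" placeholders with display names.

    The "[Speaker N]" format is an assumption for this sketch.
    Speaker numbers missing from `names` are left untouched.
    """
    pattern = re.compile(r"\[Speaker (\d+)\]")

    def substitute(match: re.Match) -> str:
        name = names.get(int(match.group(1)))
        return f"[{name}]" if name is not None else match.group(0)

    return pattern.sub(substitute, srt_text)


# Hypothetical two-cue subtitle file used only for demonstration.
sample = (
    "1\n"
    "00:00:00,000 --> 00:00:02,500\n"
    "[Speaker 1] Welcome to the show.\n"
    "\n"
    "2\n"
    "00:00:02,500 --> 00:00:05,000\n"
    "[Speaker 2] Thanks for having me.\n"
)

print(replace_speakers(sample, {1: "Host", 2: "Guest"}))
```

Leaving unknown speakers untouched keeps the output valid even when only some names are known.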
Video Generation and Animation
- Subtitle to Podcast Conversion: Transform subtitle files (.srt) into fully animated podcast videos
- Precise Programmatic Animations: Engine designed for exact frame-level control and smooth transitions
- Layout-based Scene Rendering: Professional video generation with customizable scene compositions
- Parallel Processing: Optimized and parallelized video generation engine for enhanced performance
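The general idea behind parallelized frame generation can be sketched with the standard library. This is a generic pattern under stated assumptions, not a description of how Audim's engine is implemented: `render_batch` is a hypothetical stand-in that returns frame labels instead of images, and the batch size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def make_batches(total_frames: int, batch_size: int) -> list[range]:
    """Split frame indices 0..total_frames-1 into consecutive batches."""
    return [
        range(start, min(start + batch_size, total_frames))
        for start in range(0, total_frames, batch_size)
    ]


def render_batch(frames: range) -> list[str]:
    """Hypothetical stand-in for real frame rendering."""
    return [f"frame-{index:05d}" for index in frames]


def render_all(total_frames: int, batch_size: int, workers: int = 4) -> list[str]:
    """Render all batches concurrently and return frames in order."""
    batches = make_batches(total_frames, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order,
        # so the flattened frame list stays sorted.
        rendered = pool.map(render_batch, batches)
        return [frame for batch in rendered for frame in batch]


print(render_all(total_frames=10, batch_size=4))
```

A thread pool keeps the sketch simple; a CPU-bound renderer would more likely use `ProcessPoolExecutor`, which has the same `map` interface.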
Visual Elements and Customization
- Watermark Integration: Built-in watermark system for content attribution and branding
- Multiple Layout Options: Flexible layout system supporting various podcast video styles
- Effect System: Comprehensive effects engine including transitions and highlights
- Element Customization: Configurable header, profile, text, and watermark elements
🏗️ Architecture and Modules
Aud2Sub Module
The audio-to-subtitle conversion system provides robust transcription capabilities with support for multiple transcriber backends. This module handles the initial processing of audio content into time-synchronized subtitle files.
Sub2Pod Module
The subtitle-to-podcast conversion engine represents the core animation system, featuring advanced layout management, element positioning, and visual effects. This module includes specialized components for content positioning offset and timestamp normalization.
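One plausible reading of timestamp normalization is shifting every cue so that the earliest timestamp becomes zero. The sketch below shows that reading in plain Python; the release notes do not specify the actual behavior, so the function names and the shift-to-zero rule are assumptions.

```python
import re

# SRT timestamps have the form HH:MM:SS,mmm
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(stamp: str) -> int:
    """Convert an SRT timestamp "HH:MM:SS,mmm" to milliseconds."""
    match = TIMESTAMP.fullmatch(stamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def from_ms(total: int) -> str:
    """Convert milliseconds back to "HH:MM:SS,mmm"."""
    seconds, millis = divmod(total, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def normalize(srt_text: str) -> str:
    """Shift all timestamps so that the earliest one becomes 00:00:00,000."""
    stamps = [to_ms(m.group(0)) for m in TIMESTAMP.finditer(srt_text)]
    if not stamps:
        return srt_text
    offset = min(stamps)
    return TIMESTAMP.sub(lambda m: from_ms(to_ms(m.group(0)) - offset), srt_text)


print(normalize("1\n00:01:00,500 --> 00:01:02,000\nHello\n"))
```

Working in integer milliseconds avoids floating-point drift when timestamps are shifted repeatedly.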
Utils Module
A comprehensive utility suite offering audio playback capabilities, subtitle processing tools, and video-to-audio extraction functionality. These utilities provide essential support functions for the entire pipeline.
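Video-to-audio extraction is commonly done by calling the `ffmpeg` command-line tool. The sketch below shows that approach and does not describe Audim's own utility, which may work differently. The helper names are hypothetical, and the mono 16 kHz output is an assumption chosen because it is a common input format for speech transcription.

```python
import shutil
import subprocess
from pathlib import Path


def build_extract_command(video: Path, audio: Path) -> list[str]:
    """Build an ffmpeg command that writes the audio track of `video` to `audio`."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it exists
        "-i", str(video),
        "-vn",           # drop the video stream
        "-ac", "1",      # downmix to mono (assumption for this sketch)
        "-ar", "16000",  # resample to 16 kHz (assumption for this sketch)
        str(audio),
    ]


def extract_audio(video: Path, audio: Path) -> None:
    """Run ffmpeg; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_extract_command(video, audio), check=True)


# Only print the command here so the example runs without ffmpeg or input files.
print(" ".join(build_extract_command(Path("talk.mp4"), Path("talk.wav"))))
```

Separating command construction from execution makes the command easy to inspect and test without touching any media files.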
Effects and Transitions
Advanced visual effects system supporting smooth transitions, content highlights, and professional-grade animation sequences. The effects subsystem has been completely refactored to provide enhanced API design and improved performance.
📚 Documentation and Examples
Comprehensive Documentation
- API Documentation: Complete API reference covering all modules and functions
- Usage Examples: Extensive collection of usage scripts covering various real-world scenarios
- Development Blog: Detailed development insights and version progression documentation
- Installation Guide: Step-by-step setup instructions for both users and developers
Example Scripts
The release includes multiple example scripts demonstrating various use cases and implementation patterns. These examples serve as practical starting points for different podcast video generation scenarios.
🚀 Installation and Setup
PyPI Distribution
Audim is now available on PyPI for easy installation and distribution. The package follows standard Python packaging conventions and supports modern Python environments.
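After running `pip install audim`, one way to confirm which version is installed is to query the package metadata with the standard library. The helper name `installed_version` is made up for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(installed_version("audim"))
```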
Development Environment
Complete development setup instructions are provided for contributors, including proper development environment configuration and contribution guidelines.
📄 Licensing and Attribution
Apache 2.0 License
Audim is released under the Apache 2.0 license, allowing free use for both personal and commercial projects. The license provides broad permissions while maintaining appropriate attribution requirements.
Attribution Guidelines
- Default watermark retention in generated videos
- Optional "Made with Audim" attribution in video descriptions
- Repository linking in project documentation
- Comprehensive attribution guidelines available in the NOTICE file
Citation Support
Academic and research citation formats are provided for users incorporating Audim into scholarly work. The project includes formal citation guidelines accessible through GitHub's citation feature.
⚠️ Important Notes
Development Stage Disclaimer
Although this is the first stable release, Audim remains under active development and may have limitations in some usage scenarios. The rendering engine still needs further development and testing across a wider range of use cases.
API Stability
Users should monitor documentation updates as the API may evolve based on community feedback and usage patterns. We are committed to maintaining backward compatibility while improving functionality.
Community Engagement
We encourage users to try Audim for their podcast video projects, report issues when encountered, and contribute improvements through pull requests. Community feedback is essential for continued development and enhancement.
🔗 Resources
- Documentation: https://mratanusarkar.github.io/audim/
- Usage Examples: https://mratanusarkar.github.io/audim/usage/
- Development Blog: https://mratanusarkar.github.io/audim/devblog/
- Issue Reporting: https://github.com/mratanusarkar/audim/issues
- Contributions: See development documentation for contribution guidelines
This release gives creators powerful tools for transforming audio content into engaging visual experiences. We look forward to seeing the podcast videos created with Audim and welcome community feedback and contributions.
New Contributors
- @mratanusarkar made their first contribution in #1
Full Changelog: https://github.com/mratanusarkar/audim/commits/v0.0.7