Release v0.0.7

First stable release + publish to PyPI

What's Changed

see full changelog

feat(module): initial module by @mratanusarkar in #1
feat(exp): build py script to gen .srt -> .mp4 by @mratanusarkar in #2
feat(sub2pod): convert exp script into subtitle text to podcast module by @mratanusarkar in #3
feat(docs): api documentation for sub2pod + example usage by @mratanusarkar in #4
feat(sub2pod): optimize and parallelize video generation engine by @mratanusarkar in #5
feat(docs): demo example usage tutorial with real world usecase by @mratanusarkar in #6
feat(docu): minor update readme and docs by @mratanusarkar in #8
feat(docu): add github ci for docs by @mratanusarkar in #9
feat(sub2pod): implement effects submodule + re-factor audim API design by @mratanusarkar in #11
chore(sub2pod): fix docs, lints and formats by @mratanusarkar in #12
feat(exp): build py script to gen .mp3 -> .srt + playback .mp3 + .srt in RT by @mratanusarkar in #13
feat(sub2pod): srt timestamp normalization + content positioning offset by @mratanusarkar in #14
feat(aud2sub): convert exp script into audio to subtitle text module by @mratanusarkar in #15
feat(util): add utility module to playback audio with srt + replace speaker placeholders with names by @mratanusarkar in #17
feat(docs): api documentation for aud2sub & utils modules + example usage by @mratanusarkar in #19
feat(util): add utility module to extract audio from video by @mratanusarkar in #24
feat(sub2pod): add watermark element by @mratanusarkar in #25
feat(docs): organize docs, introduce devblog, add pre-release blog by @mratanusarkar in #26
feat(docs): add usage and examples covering various scenarios by @mratanusarkar in #27
feat(license): add license, attribution & citation by @mratanusarkar in #28
feat(docs): fix: asset links + add: demo and disclaimer sections by @mratanusarkar in #29
feat(docs): publish to pypi + improve docs by @mratanusarkar in #30
fix(docs): broken links in readme by @mratanusarkar in #31

Audim - First Stable Release 🎉

see release notes (short)

Audim (`v0.0.7`) - First Stable Release

Transform audio podcasts into engaging animated videos with precise programmatic control.

What's New

🎙️ Audio to Video Pipeline

Automatic Transcription: Convert audio/video files to timestamped subtitles
Speaker Diarization: Identify and label multiple speakers
Animated Video Generation: Transform subtitles + audio into professional podcast videos

🎬 Key Features

Multi-format Support: MP3, M4A, WAV, MP4, MKV, AVI
Layout System: Customizable scenes with headers, profiles, and text elements
Effects Engine: Smooth transitions and highlight effects
Parallel Processing: Optimized rendering for faster video generation
Watermark System: Built-in branding and attribution

Installation

pip install audim

Documentation

https://mratanusarkar.github.io/audim/

License

Apache 2.0 - Free for personal and commercial use. Please retain the default watermark or add attribution.

Contributing

We welcome contributions! Please check our development guide to get started.

Note: This is our first stable release. While extensively tested, please report any issues you encounter. Your feedback helps us improve!

Report Issues | View Examples | Join Discussion

see full release notes (long)

Audim (`v0.0.7`) - Audio Podcast Animation Engine

We are excited to announce the first stable release of Audim, an animation and video rendering engine for creating visually engaging podcast videos from audio-based and voice-based content. This release is the culmination of development work spanning multiple iterations and marks a significant milestone in programmatic podcast video generation.

🎯 Overview

Audim provides precise programmatic animation and video rendering for audio-based content. It lets creators convert raw audio recordings into animated podcast videos with layout-based scenes, automated subtitle generation, and customizable visual elements.

✨ Core Features

Audio Processing and Transcription

Audio to Subtitle Generation: Complete audio transcription engine with support for multiple audio formats (.mp3, .m4a, .wav)
Real-time Processing: Generate subtitles and transcripts from audio/video files with timestamp synchronization
Speaker Recognition: Advanced speaker identification and placeholder replacement with actual names
Multi-format Support: Compatible with various video formats (.mp4, .mkv, .avi) for audio extraction
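
To illustrate the audio extraction step mentioned above, here is a minimal sketch that pulls the audio track out of a video by calling the `ffmpeg` command-line tool. It does not use Audim's own API: the function names `build_extract_command` and `extract_audio` are hypothetical, and it assumes `ffmpeg` is installed and on the `PATH`.

```python
import subprocess


def build_extract_command(video_path: str, audio_path: str) -> list[str]:
    """Build an ffmpeg command that copies only the audio stream to audio_path.

    -y   overwrite the output file if it already exists
    -i   input file
    -vn  drop the video stream, keeping audio only
    The output format is inferred by ffmpeg from the audio_path extension.
    """
    return ["ffmpeg", "-y", "-i", video_path, "-vn", audio_path]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_extract_command(video_path, audio_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_extract_command("episode.mp4", "episode.wav")))
```

Keeping the command construction separate from the `subprocess.run` call makes the sketch easy to inspect without having `ffmpeg` available.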

Video Generation and Animation

Subtitle to Podcast Conversion: Transform subtitle files (.srt) into fully animated podcast videos
Precise Programmatic Animations: Engine designed for exact frame-level control and smooth transitions
Layout-based Scene Rendering: Professional video generation with customizable scene compositions
Parallel Processing: Optimized and parallelized video generation engine for enhanced performance

Visual Elements and Customization

Watermark Integration: Built-in watermark system for content attribution and branding
Multiple Layout Options: Flexible layout system supporting various podcast video styles
Effect System: Comprehensive effects engine including transitions and highlights
Element Customization: Configurable header, profile, text, and watermark elements

🏗️ Architecture and Modules

Aud2Sub Module

The audio-to-subtitle conversion system provides robust transcription capabilities with support for multiple transcriber backends. This module handles the initial processing of audio content into time-synchronized subtitle files.

Sub2Pod Module

The subtitle-to-podcast conversion engine represents the core animation system, featuring advanced layout management, element positioning, and visual effects. This module includes specialized components for content positioning offset and timestamp normalization.
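
As a rough illustration of what SRT timestamp normalization can look like, the sketch below shifts every timestamp in a subtitle file so that the earliest one becomes `00:00:00,000`. This is an assumption about what "normalization" means here, not Audim's actual implementation; the function names are hypothetical and only the Python standard library is used.

```python
import re
from datetime import timedelta

# SRT timestamps have the form HH:MM:SS,mmm
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def parse_timestamp(text: str) -> timedelta:
    """Convert 'HH:MM:SS,mmm' into a timedelta."""
    match = TIMESTAMP.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {text!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return timedelta(hours=hours, minutes=minutes,
                     seconds=seconds, milliseconds=millis)


def format_timestamp(value: timedelta) -> str:
    """Convert a timedelta back into 'HH:MM:SS,mmm'."""
    total_ms = value // timedelta(milliseconds=1)  # exact integer division
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def normalize_srt(srt_text: str) -> str:
    """Shift all timestamps so the earliest one starts at zero.

    Simplification: any text matching the timestamp pattern is shifted,
    including a timestamp-like string inside a subtitle line.
    """
    stamps = [parse_timestamp(m.group(0)) for m in TIMESTAMP.finditer(srt_text)]
    if not stamps:
        return srt_text
    offset = min(stamps)
    return TIMESTAMP.sub(
        lambda m: format_timestamp(parse_timestamp(m.group(0)) - offset),
        srt_text,
    )


if __name__ == "__main__":
    sample = (
        "1\n00:00:05,500 --> 00:00:07,000\nHello\n\n"
        "2\n00:00:07,250 --> 00:00:09,000\nHi there\n"
    )
    print(normalize_srt(sample))
```

A content positioning offset could be sketched the same way by adding a fixed `timedelta` instead of subtracting the minimum.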

Utils Module

A comprehensive utility suite offering audio playback capabilities, subtitle processing tools, and video-to-audio extraction functionality. These utilities provide essential support functions for the entire pipeline.
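
The speaker placeholder replacement can be pictured with a small standalone sketch. It assumes placeholders appear as bracketed labels such as `[Speaker 1]` at the start of a subtitle line; that format, and the function name `replace_speakers`, are assumptions for illustration and may not match Audim's utilities.

```python
import re

# Matches a bracketed label such as "[Speaker 1]" (assumed placeholder format).
LABEL = re.compile(r"\[([^\]]+)\]")


def replace_speakers(srt_text: str, names: dict[str, str]) -> str:
    """Replace bracketed speaker placeholders with real names.

    Labels that are not present in `names` are left unchanged.
    """
    def swap(match: re.Match) -> str:
        label = match.group(1)
        return f"[{names.get(label, label)}]"

    return LABEL.sub(swap, srt_text)


if __name__ == "__main__":
    text = "[Speaker 1] Welcome to the show.\n[Speaker 2] Thanks for having me."
    # Hypothetical mapping; only Speaker 1 is renamed.
    print(replace_speakers(text, {"Speaker 1": "Host"}))
```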

Effects and Transitions

The visual effects system supports smooth transitions and content highlights. It was introduced as a dedicated effects submodule together with a refactor of the Audim API design.

📚 Documentation and Examples

Comprehensive Documentation

API Documentation: Complete API reference covering all modules and functions
Usage Examples: Extensive collection of usage scripts covering various real-world scenarios
Development Blog: Detailed development insights and version progression documentation
Installation Guide: Step-by-step setup instructions for both users and developers

Example Scripts

The release includes multiple example scripts demonstrating various use cases and implementation patterns. These examples serve as practical starting points for different podcast video generation scenarios.

🚀 Installation and Setup

PyPI Distribution

Audim is now available on PyPI for easy installation and distribution. The package follows standard Python packaging conventions and supports modern Python environments.

Development Environment

Complete development setup instructions are provided for contributors, including proper development environment configuration and contribution guidelines.

📄 Licensing and Attribution

Apache 2.0 License

Audim is released under the Apache 2.0 license, allowing free use for both personal and commercial projects. The license provides broad permissions while maintaining appropriate attribution requirements.

Attribution Guidelines

Default watermark retention in generated videos
Optional "Made with Audim" attribution in video descriptions
Repository linking in project documentation
Comprehensive attribution guidelines available in the NOTICE file

Citation Support

Academic and research citation formats are provided for users incorporating Audim into scholarly work. The project includes formal citation guidelines accessible through GitHub's citation feature.

⚠️ Important Notes

Development Stage Disclaimer

Although this is the first stable release, Audim is still under active development and may have limitations in some usage scenarios. The rendering engine needs further development and testing across a wider range of use cases.

API Stability

Users should monitor documentation updates, as the API may evolve based on community feedback and usage patterns. We aim to maintain backward compatibility while improving functionality.

Community Engagement

We encourage users to try Audim for their podcast video projects, report issues when encountered, and contribute improvements through pull requests. Community feedback is essential for continued development and enhancement.

🔗 Resources

Documentation: https://mratanusarkar.github.io/audim/
Usage Examples: https://mratanusarkar.github.io/audim/usage/
Development Blog: https://mratanusarkar.github.io/audim/devblog/
Issue Reporting: https://github.com/mratanusarkar/audim/issues
Contributions: See development documentation for contribution guidelines

This release represents a significant milestone in programmatic podcast video generation, providing creators with powerful tools for transforming audio content into engaging visual experiences. We look forward to seeing the innovative podcast videos created with Audim and welcome community feedback and contributions.

New Contributors

@mratanusarkar made their first contribution in #1

Full Changelog: https://github.com/mratanusarkar/audim/commits/v0.0.7
