Overview
Explore integrating Nvidia's Parakeet model via MLX as an additional offline transcription option alongside WhisperKit.
Background
- Parakeet MLX Port: https://github.com/senstella/parakeet-mlx
- Swift Bindings: https://github.com/FluidInference/swift-parakeet-mlx
Parakeet is Nvidia's ASR (Automatic Speech Recognition) model that has been ported to MLX for Apple Silicon. This could provide an alternative local transcription option with potentially different accuracy/performance characteristics compared to WhisperKit.
Investigation Tasks
1. Technical Feasibility
- Review the Swift Parakeet MLX library's compatibility with our current stack (Swift 6.0, macOS 14.0+)
- Assess model sizes and download requirements
- Compare performance characteristics with WhisperKit (speed, accuracy, memory usage)
- Verify Apple Silicon compatibility and requirements
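As a starting point for the compatibility check, here is a minimal sketch of how the app might gate the Parakeet option. The names `ParakeetAvailability`, `isBuiltForAppleSilicon`, and `parakeetAvailability` are hypothetical. The macOS 14 floor is taken from our current stack; the library's own minimum has not been verified.

```swift
import Foundation

// Hypothetical result type for deciding whether to offer Parakeet in Settings.
enum ParakeetAvailability: Equatable {
    case available
    case unsupportedArchitecture
    case unsupportedOS
}

// Compile-time check of the slice being built. An x86_64 slice running under
// Rosetta reports false here, even on Apple Silicon hardware.
func isBuiltForAppleSilicon() -> Bool {
    #if arch(arm64)
    return true
    #else
    return false
    #endif
}

// Pure function so the decision can be unit tested with arbitrary inputs.
// Assumption: macOS 14 is the minimum, matching the app's deployment target.
func parakeetAvailability(
    isAppleSilicon: Bool,
    osVersion: OperatingSystemVersion
) -> ParakeetAvailability {
    guard isAppleSilicon else { return .unsupportedArchitecture }
    guard osVersion.majorVersion >= 14 else { return .unsupportedOS }
    return .available
}
```

At runtime this could be called as `parakeetAvailability(isAppleSilicon: isBuiltForAppleSilicon(), osVersion: ProcessInfo.processInfo.operatingSystemVersion)`.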
2. Architecture Integration
- Determine how to implement the TranscriptionService protocol for Parakeet
- Identify any conflicts or integration challenges with existing WhisperKit service
- Plan model download and management workflow (similar to WhisperKit)
- Consider shared MLX infrastructure between Parakeet transcription and MLX post-processing
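To make the integration question concrete, here is a rough sketch of what a Parakeet-backed service could look like. The protocol below is a hypothetical stand-in, not the actual contents of Core/Domain/TranscriptionService.swift, and the call into swift-parakeet-mlx is deliberately left as an injected closure because its API has not been reviewed yet.

```swift
import Foundation

// Hypothetical stand-in for the real protocol; its actual requirements may differ.
protocol TranscriptionService {
    var displayName: String { get }
    var isModelDownloaded: Bool { get }
    func transcribe(audioURL: URL) async throws -> String
}

// Hypothetical error type for this sketch.
enum TranscriptionError: Error, Equatable {
    case modelNotDownloaded
}

// Sketch of a Parakeet backend. `runModel` is an assumed injection point that
// would wrap the Swift bindings once their API is confirmed.
struct ParakeetService: TranscriptionService {
    let displayName = "Parakeet (MLX)"
    var isModelDownloaded: Bool
    var runModel: @Sendable (URL) async throws -> String

    func transcribe(audioURL: URL) async throws -> String {
        // Refuse to run until the model files are present on disk.
        guard isModelDownloaded else {
            throw TranscriptionError.modelNotDownloaded
        }
        return try await runModel(audioURL)
    }
}

// Names of the backends that are ready to use, e.g. for a Settings picker.
func selectableServiceNames(_ services: [any TranscriptionService]) -> [String] {
    services.filter { $0.isModelDownloaded }.map { $0.displayName }
}
```

Injecting the model call keeps the download check and error handling testable without loading a model, and the existing WhisperKit service could sit behind the same protocol unchanged.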
3. User Experience
- Design UI for Parakeet model selection in Settings
- Plan user messaging for when to choose Parakeet vs WhisperKit
- Consider download size and storage implications for users
- Determine default model selection strategy
4. Testing Requirements
- Identify test scenarios for accuracy comparison
- Plan performance benchmarking approach
- Consider edge cases (background noise, accents, technical jargon)
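For the accuracy comparison, one common ASR metric is word error rate (WER). This issue does not prescribe a metric, so treat the following as one possible approach. The function name `wordErrorRate` and the lowercasing and whitespace tokenization are assumptions; a real benchmark would likely also normalize punctuation.

```swift
import Foundation

/// Word error rate: word-level edit distance divided by the reference word count.
/// Both strings are lowercased and split on whitespace before comparison.
func wordErrorRate(reference: String, hypothesis: String) -> Double {
    let ref = reference.lowercased().split(whereSeparator: \.isWhitespace).map(String.init)
    let hyp = hypothesis.lowercased().split(whereSeparator: \.isWhitespace).map(String.init)

    // Convention chosen for this sketch: an empty reference scores 0 only
    // when the hypothesis is also empty, otherwise 1.
    guard !ref.isEmpty else { return hyp.isEmpty ? 0 : 1 }

    // Levenshtein distance over words, keeping only two rows of the table.
    var previous = Array(0...hyp.count)
    for (i, refWord) in ref.enumerated() {
        var current = [i + 1] + Array(repeating: 0, count: hyp.count)
        for (j, hypWord) in hyp.enumerated() {
            let substitution = previous[j] + (refWord == hypWord ? 0 : 1)
            let deletion = previous[j + 1] + 1
            let insertion = current[j] + 1
            current[j + 1] = min(substitution, deletion, insertion)
        }
        previous = current
    }
    return Double(previous[hyp.count]) / Double(ref.count)
}
```

Running the same reference clips through WhisperKit and Parakeet and comparing WER per scenario (noise, accents, jargon) would give a like-for-like accuracy number alongside timing and memory measurements.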
Success Criteria
- Parakeet model integrates seamlessly as a third transcription option
- Users can switch between WhisperKit and Parakeet models easily
- Performance metrics documented for both models
- Clear guidance provided for users on which model to choose
Out of Scope
- Replacing WhisperKit entirely (keep both options available)
- Supporting Intel Macs (Parakeet requires MLX and Apple Silicon, like the current MLX service)
References
- Current WhisperKit implementation: Services/WhisperKitService.swift
- TranscriptionService protocol: Core/Domain/TranscriptionService.swift
- MLX service pattern: Services/MLXService.swift
Priority: Medium
Effort: Medium-Large (requires research, prototyping, and integration)
Type: Enhancement