# cjm-transcription-plugin-system

> A flexible plugin system for audio transcription intended to make it easy to add support for multiple backends.

## Install

```bash
pip install cjm_transcription_plugin_system
```

## Project Structure

```
nbs/
├── core.ipynb             # Core data structures for audio transcription
└── plugin_interface.ipynb # Domain-specific plugin interface for audio transcription plugins
```

Total: 2 notebooks

## Module Dependencies

```mermaid
graph LR
    core[core<br/>core]
    plugin_interface[plugin_interface<br/>Transcription Plugin Interface]

    plugin_interface --> core
```

*1 cross-module dependency detected*

## CLI Reference

No CLI commands found in this project.

## Module Overview

Detailed documentation for each module in the project:

### core (`core.ipynb`)
> Core data structures for audio transcription

#### Import

```python
from cjm_transcription_plugin_system.core import (
    AudioData,
    TranscriptionResult
)
```

#### Classes

```python
@dataclass
class AudioData:
    "Container for audio data and metadata."
    
    samples: np.ndarray  # Audio sample data as a numpy array
    sample_rate: int  # Sample rate in Hz (e.g., 16000, 44100)
    duration: float  # Duration of the audio in seconds
    filepath: Optional[Path]  # Audio file path
    metadata: Dict[str, Any] = field(...)  # Additional metadata
```

```python
@dataclass
class TranscriptionResult:
    "Standardized transcription output."
    
    text: str  # The transcribed text
    confidence: Optional[float]  # Overall confidence score (0.0 to 1.0)
    segments: Optional[List[Dict]] = field(...)  # List of transcription segments with timestamps and text
    metadata: Optional[Dict] = field(...)  # Transcription metadata
```
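
As a rough illustration of how a `TranscriptionResult` might be populated and consumed, the sketch below uses a local stand-in dataclass that mirrors the fields listed above, so it runs without the package installed. In real code you would import the class from `cjm_transcription_plugin_system.core` instead. The segment keys (`start`, `end`, `text`), the `default_factory` defaults, and the `format_segments` helper are assumptions made for illustration; the documentation above only states that segments carry timestamps and text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


# Stand-in mirroring the documented fields; the real class lives in
# cjm_transcription_plugin_system.core. The field defaults are assumptions.
@dataclass
class TranscriptionResult:
    text: str
    confidence: Optional[float]
    segments: Optional[List[Dict]] = field(default_factory=list)
    metadata: Optional[Dict] = field(default_factory=dict)


def format_segments(result: TranscriptionResult) -> List[str]:
    """Render each segment as '[start-end] text' (hypothetical helper).

    Assumes each segment dict has 'start' and 'end' (seconds) and 'text' keys.
    """
    lines = []
    for seg in result.segments or []:  # tolerate segments=None
        lines.append(f"[{seg['start']:.2f}-{seg['end']:.2f}] {seg['text']}")
    return lines


result = TranscriptionResult(
    text="hello world",
    confidence=0.92,
    segments=[
        {"start": 0.0, "end": 0.5, "text": "hello"},
        {"start": 0.5, "end": 1.1, "text": "world"},
    ],
    metadata={"model": "example-model"},  # illustrative value only
)

for line in format_segments(result):
    print(line)
```

Because `segments` and `metadata` are both `Optional`, consumer code should guard against `None`, as the helper does with `result.segments or []`.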


### Transcription Plugin Interface (`plugin_interface.ipynb`)
> Domain-specific plugin interface for audio transcription plugins

#### Import

```python
from cjm_transcription_plugin_system.plugin_interface import (
    TranscriptionPlugin
)
```

#### Classes

```python
class TranscriptionPlugin(PluginInterface):
    """
    Transcription-specific plugin interface.
    
    This extends the generic PluginInterface with transcription-specific
    requirements like supported audio formats and the execute signature.
    
    All transcription plugins must implement this interface.
    """
    
    def supported_formats(
            self
        ) -> List[str]:  # List of file extensions this plugin can process
        """List of supported audio formats (e.g., ['wav', 'mp3']).

        Returns:
            List of file extensions without the dot (e.g., ['wav', 'mp3', 'flac'])
        """
    
    def execute(
            self,
            audio: Union[AudioData, str, Path],  # Audio data or path to audio file
            **kwargs  # Additional plugin-specific parameters
        ) -> TranscriptionResult:  # Transcription result with text and metadata
        """Transcribe audio to text.

        Args:
            audio: Audio data (AudioData object), file path (str), or Path object
            **kwargs: Additional plugin-specific parameters (e.g., language, model)

        Returns:
            TranscriptionResult containing transcribed text, confidence, segments, and metadata
        """
```
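
To make the shape of the interface more concrete, here is a minimal sketch of a toy backend that follows the two method signatures above. It deliberately does not subclass the real `TranscriptionPlugin`, because the other members required by the generic `PluginInterface` are not documented here; `EchoPlugin`, the stand-in `TranscriptionResult`, and the `language` keyword are all hypothetical. The sketch also handles only path inputs, whereas the real interface additionally accepts `AudioData`.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional, Union


# Stand-in for the documented result type so the sketch runs on its own.
@dataclass
class TranscriptionResult:
    text: str
    confidence: Optional[float]
    segments: Optional[List[Dict]] = field(default_factory=list)
    metadata: Optional[Dict] = field(default_factory=dict)


class EchoPlugin:
    """Hypothetical plugin: 'transcribes' a file by echoing its name."""

    def supported_formats(self) -> List[str]:
        # Extensions without the leading dot, as the interface describes.
        return ["wav", "mp3", "flac"]

    def execute(self, audio: Union[str, Path], **kwargs) -> TranscriptionResult:
        path = Path(audio)
        # Normalize '.WAV' -> 'wav' before checking against supported formats.
        ext = path.suffix.lower().lstrip(".")
        if ext not in self.supported_formats():
            raise ValueError(f"Unsupported audio format: {ext!r}")
        # A real backend would decode the audio and run a model here.
        return TranscriptionResult(
            text=f"(transcript of {path.name})",
            confidence=None,  # this toy backend has no confidence estimate
            metadata={"language": kwargs.get("language", "unknown")},
        )


plugin = EchoPlugin()
result = plugin.execute("meeting.WAV", language="en")
print(result.text)
print(result.metadata)
```

Checking the extension against `supported_formats()` before doing any work is one possible design; a real plugin might instead defer format detection to its audio decoding library.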
