Skip to content

Claude/refactor diarizer extraction 011 cv4 b jrcp gzf cz4c he hdb u#21

Merged
Gambitnl merged 3 commits intomainfrom
claude/refactor-diarizer-extraction-011CV4BJrcpGzfCZ4cHeHdbU
Nov 12, 2025
Merged

Claude/refactor diarizer extraction 011 cv4 b jrcp gzf cz4c he hdb u#21
Gambitnl merged 3 commits intomainfrom
claude/refactor-diarizer-extraction-011CV4BJrcpGzfCZ4cHeHdbU

Conversation

@Gambitnl
Copy link
Copy Markdown
Owner

No description provided.

…hods

This refactoring addresses technical debt by breaking down the 98-line
diarize() method into three focused, single-responsibility methods:

## Extracted Methods

1. **_load_audio_for_diarization()** (16 lines)
   - Loads audio using torchaudio with fallback to file path
   - Handles in-memory vs file-based loading
   - Clear error handling with debug logging

2. **_perform_diarization()** (19 lines)
   - Executes the pyannote pipeline
   - Converts results to SpeakerSegment objects
   - Returns both raw diarization and formatted segments
   - Improved logging with speaker count

3. **_extract_speaker_embeddings()** (66 lines)
   - Extracts voice embeddings for each speaker
   - Handles pydub import and audio loading errors gracefully
   - Processes all segments per speaker
   - Returns dict of speaker embeddings with detailed logging

## Refactored Main Method

The main diarize() method (18 lines) now:
- Clearly orchestrates the three-step pipeline
- Has improved documentation with numbered steps
- Maintains backward compatibility (same API)
- Better separation of concerns

## Testing

Added comprehensive unit tests (262 lines) covering:
- Audio loading with torchaudio success and fallback
- Diarization execution with various input types
- Embedding extraction success and error cases
- Empty segment handling
- Import error handling
- Partial failure scenarios (one speaker fails)

## Metrics

- **Before**: 1 method with 98 lines
- **After**: 4 methods averaging ~30 lines each
- **Added Tests**: 13 new test cases for extracted methods
- **Code Quality**: Improved maintainability and testability
- **API**: No breaking changes - fully backward compatible

## Benefits

- Each method has a single, clear responsibility
- Easier to test individual steps independently
- Better error handling and logging throughout
- Simpler to debug and extend
- Reduced cognitive load when reading code

Fixes technical debt identified in refactoring candidate #6.
Incorporates enum refactoring changes from main branch to ensure
compatibility with the latest codebase standards.
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @Gambitnl, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the speaker diarization process by modularizing the diarize method into several private helper functions. This change improves code readability, maintainability, and testability by separating concerns such as audio loading, diarization execution, and speaker embedding extraction. The refactoring also enhances error handling and provides more detailed logging for various stages of the diarization pipeline, making the overall system more robust.

Highlights

  • Refactored Diarization Logic: The core diarize method in src/diarizer.py has been broken down into three smaller, more focused private methods to improve modularity and readability.
  • Improved Audio Loading Strategy: A new _load_audio_for_diarization method has been introduced to handle audio loading, prioritizing in-memory processing using torchaudio and providing a robust fallback to file-based loading if torchaudio is unavailable or fails.
  • Dedicated Diarization Execution: The process of executing the diarization pipeline and converting its raw output into structured SpeakerSegment objects is now encapsulated within a new _perform_diarization method.
  • Encapsulated Embedding Extraction with Enhanced Error Handling: The logic for extracting speaker embeddings has been moved to _extract_speaker_embeddings, which includes comprehensive error handling for pydub imports, audio loading issues, and individual speaker inference failures, ensuring the process is more resilient.
  • Comprehensive Unit Test Coverage: A new test class, TestExtractedMethods, has been added to tests/test_diarizer.py to provide thorough unit test coverage for all the newly extracted private methods, validating their functionality and error handling capabilities.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the diarize method in SpeakerDiarizer by breaking it down into smaller, more focused private methods: _load_audio_for_diarization, _perform_diarization, and _extract_speaker_embeddings. This is a great improvement for code readability, maintainability, and testability. The changes are accompanied by a comprehensive new suite of tests for these extracted methods, covering various success and failure scenarios.

My review focuses on further improving the maintainability and robustness of the new code. I've suggested adding missing type hints to the new method signatures, making an exception handler more specific, and refactoring duplicated test setup code into a pytest fixture. Overall, this is a solid refactoring that significantly improves the quality of the diarization logic.

Comment thread src/diarizer.py Outdated
self.pipeline = None # Ensure it's None on failure

def diarize(self, audio_path: Path) -> Tuple[List[SpeakerSegment], Dict[str, np.ndarray]]:
def _load_audio_for_diarization(self, audio_path: Path):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This function is missing a return type hint. Adding one would improve code clarity and allow for better static analysis. Based on the function's logic and docstring, the return type should be Union[Dict, str]. This would require importing Union from typing.

Comment thread src/diarizer.py Outdated

return diarization_input

def _perform_diarization(self, diarization_input) -> Tuple[Any, List[SpeakerSegment]]:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The diarization_input parameter is missing a type hint. To improve readability and type safety, you should specify its type. Based on the return type of _load_audio_for_diarization, this should be Union[Dict, str]. This would require importing Union from typing.

Comment thread src/diarizer.py Outdated
def _extract_speaker_embeddings(
self,
audio_path: Path,
diarization
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The diarization parameter is missing a type hint. Adding Any would make the signature more explicit about what it accepts, improving readability.

Suggested change
diarization
diarization: Any

Comment thread src/diarizer.py Outdated
Comment on lines +458 to +462
except Exception as exc:
self.logger.warning(
"Unable to import pydub for embedding extraction: %s",
exc
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching a broad Exception for an import is generally discouraged as it can mask other unexpected errors. It's better to catch the more specific ImportError here to make the code's intent clearer and more robust.

Suggested change
except Exception as exc:
self.logger.warning(
"Unable to import pydub for embedding extraction: %s",
exc
)
except ImportError as exc:
self.logger.warning(
"Unable to import pydub for embedding extraction: %s",
exc
)

Comment thread tests/test_diarizer.py Outdated
Comment on lines +550 to +583
def test_extract_speaker_embeddings_handles_inference_error(self, MockAudioSegment, diarizer, tmp_path, caplog):
"""Test that embedding extraction continues when inference fails for one speaker."""
# Setup: first speaker succeeds, second fails
mock_embedding_model = MagicMock()
mock_embedding_model.side_effect = [
torch.tensor([[0.1, 0.2]]), # Success for SPEAKER_00
RuntimeError("Inference failed"), # Failure for SPEAKER_01
]
diarizer.embedding_model = mock_embedding_model

# Mock audio
mock_audio_chunk = MagicMock()
mock_audio_chunk.frame_rate = 16000
mock_audio_chunk.__len__.return_value = 16000
mock_audio_chunk.get_array_of_samples.return_value = np.zeros(16000, dtype=np.int16)

mock_full_audio = MagicMock()
mock_full_audio.frame_rate = 16000
mock_full_audio.__getitem__ = lambda self, key: mock_audio_chunk

accumulated_audio = MagicMock()
accumulated_audio.frame_rate = 16000
accumulated_audio.__len__.return_value = 16000
accumulated_audio.get_array_of_samples.return_value = np.zeros(16000, dtype=np.int16)
accumulated_audio.__add__ = lambda self, other: accumulated_audio
accumulated_audio.__iadd__ = lambda self, other: accumulated_audio

mock_empty = MagicMock()
mock_empty.__len__.return_value = 0
mock_empty.__add__ = lambda self, other: accumulated_audio
mock_empty.__iadd__ = lambda self, other: accumulated_audio

MockAudioSegment.from_wav.return_value = mock_full_audio
MockAudioSegment.empty.return_value = mock_empty
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This test has a lot of complex mocking setup for pydub.AudioSegment that is very similar to the setup in test_extract_speaker_embeddings_successful. To reduce duplication and improve maintainability of these tests, consider extracting this common mocking logic into a pytest fixture. The fixture could return the configured MockAudioSegment class or a tuple of configured mocks.

…maintainability

Based on code review from gemini-code-assist bot, this commit makes the
following improvements:

## Type Hints Enhancement

1. **Added Union to typing imports**
   - Enables proper type annotation for methods accepting multiple types

2. **_load_audio_for_diarization return type**
   - Added `-> Union[Dict, str]` return type hint
   - Clearly indicates the method can return either a dict or string

3. **_perform_diarization parameter type**
   - Added `diarization_input: Union[Dict, str]` type hint
   - Makes the dual-mode input explicit in the signature

4. **_extract_speaker_embeddings parameter type**
   - Added `diarization: Any` type hint
   - Documents that this accepts pyannote pipeline results

## Exception Handling Improvement

5. **More specific exception catching**
   - Changed `except Exception` to `except ImportError` for pydub import
   - Prevents masking of other unexpected errors
   - Makes error handling intent clearer

## Test Code Quality

6. **Extracted common mock setup to fixture**
   - Created `mock_audio_segment_setup` pytest fixture
   - Reduces ~60 lines of duplicated mock configuration
   - Refactored `test_extract_speaker_embeddings_successful`
   - Refactored `test_extract_speaker_embeddings_handles_inference_error`
   - Improves test maintainability and readability

## Benefits

- Better static type checking support
- Clearer API contracts through type hints
- More maintainable and DRY test code
- More robust exception handling
- Improved code readability

All changes maintain backward compatibility.
@Gambitnl Gambitnl merged commit 44e83d0 into main Nov 12, 2025
@Gambitnl Gambitnl deleted the claude/refactor-diarizer-extraction-011CV4BJrcpGzfCZ4cHeHdbU branch November 12, 2025 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants