Conversation
…hods This refactoring addresses technical debt by breaking down the 98-line diarize() method into three focused, single-responsibility methods: ## Extracted Methods 1. **_load_audio_for_diarization()** (16 lines) - Loads audio using torchaudio with fallback to file path - Handles in-memory vs file-based loading - Clear error handling with debug logging 2. **_perform_diarization()** (19 lines) - Executes the pyannote pipeline - Converts results to SpeakerSegment objects - Returns both raw diarization and formatted segments - Improved logging with speaker count 3. **_extract_speaker_embeddings()** (66 lines) - Extracts voice embeddings for each speaker - Handles pydub import and audio loading errors gracefully - Processes all segments per speaker - Returns dict of speaker embeddings with detailed logging ## Refactored Main Method The main diarize() method (18 lines) now: - Clearly orchestrates the three-step pipeline - Has improved documentation with numbered steps - Maintains backward compatibility (same API) - Better separation of concerns ## Testing Added comprehensive unit tests (262 lines) covering: - Audio loading with torchaudio success and fallback - Diarization execution with various input types - Embedding extraction success and error cases - Empty segment handling - Import error handling - Partial failure scenarios (one speaker fails) ## Metrics - **Before**: 1 method with 98 lines - **After**: 4 methods averaging ~30 lines each - **Added Tests**: 13 new test cases for extracted methods - **Code Quality**: Improved maintainability and testability - **API**: No breaking changes - fully backward compatible ## Benefits - Each method has a single, clear responsibility - Easier to test individual steps independently - Better error handling and logging throughout - Simpler to debug and extend - Reduced cognitive load when reading code Fixes technical debt identified in refactoring candidate #6.
Incorporates enum refactoring changes from main branch to ensure compatibility with the latest codebase standards.
Summary of ChangesHello @Gambitnl, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the speaker diarization process by modularizing the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request refactors the diarize method in SpeakerDiarizer by breaking it down into smaller, more focused private methods: _load_audio_for_diarization, _perform_diarization, and _extract_speaker_embeddings. This is a great improvement for code readability, maintainability, and testability. The changes are accompanied by a comprehensive new suite of tests for these extracted methods, covering various success and failure scenarios.
My review focuses on further improving the maintainability and robustness of the new code. I've suggested adding missing type hints to the new method signatures, making an exception handler more specific, and refactoring duplicated test setup code into a pytest fixture. Overall, this is a solid refactoring that significantly improves the quality of the diarization logic.
| self.pipeline = None # Ensure it's None on failure | ||
|
|
||
| def diarize(self, audio_path: Path) -> Tuple[List[SpeakerSegment], Dict[str, np.ndarray]]: | ||
| def _load_audio_for_diarization(self, audio_path: Path): |
|
|
||
| return diarization_input | ||
|
|
||
| def _perform_diarization(self, diarization_input) -> Tuple[Any, List[SpeakerSegment]]: |
There was a problem hiding this comment.
| def _extract_speaker_embeddings( | ||
| self, | ||
| audio_path: Path, | ||
| diarization |
| except Exception as exc: | ||
| self.logger.warning( | ||
| "Unable to import pydub for embedding extraction: %s", | ||
| exc | ||
| ) |
There was a problem hiding this comment.
Catching a broad Exception for an import is generally discouraged as it can mask other unexpected errors. It's better to catch the more specific ImportError here to make the code's intent clearer and more robust.
| except Exception as exc: | |
| self.logger.warning( | |
| "Unable to import pydub for embedding extraction: %s", | |
| exc | |
| ) | |
| except ImportError as exc: | |
| self.logger.warning( | |
| "Unable to import pydub for embedding extraction: %s", | |
| exc | |
| ) |
| def test_extract_speaker_embeddings_handles_inference_error(self, MockAudioSegment, diarizer, tmp_path, caplog): | ||
| """Test that embedding extraction continues when inference fails for one speaker.""" | ||
| # Setup: first speaker succeeds, second fails | ||
| mock_embedding_model = MagicMock() | ||
| mock_embedding_model.side_effect = [ | ||
| torch.tensor([[0.1, 0.2]]), # Success for SPEAKER_00 | ||
| RuntimeError("Inference failed"), # Failure for SPEAKER_01 | ||
| ] | ||
| diarizer.embedding_model = mock_embedding_model | ||
|
|
||
| # Mock audio | ||
| mock_audio_chunk = MagicMock() | ||
| mock_audio_chunk.frame_rate = 16000 | ||
| mock_audio_chunk.__len__.return_value = 16000 | ||
| mock_audio_chunk.get_array_of_samples.return_value = np.zeros(16000, dtype=np.int16) | ||
|
|
||
| mock_full_audio = MagicMock() | ||
| mock_full_audio.frame_rate = 16000 | ||
| mock_full_audio.__getitem__ = lambda self, key: mock_audio_chunk | ||
|
|
||
| accumulated_audio = MagicMock() | ||
| accumulated_audio.frame_rate = 16000 | ||
| accumulated_audio.__len__.return_value = 16000 | ||
| accumulated_audio.get_array_of_samples.return_value = np.zeros(16000, dtype=np.int16) | ||
| accumulated_audio.__add__ = lambda self, other: accumulated_audio | ||
| accumulated_audio.__iadd__ = lambda self, other: accumulated_audio | ||
|
|
||
| mock_empty = MagicMock() | ||
| mock_empty.__len__.return_value = 0 | ||
| mock_empty.__add__ = lambda self, other: accumulated_audio | ||
| mock_empty.__iadd__ = lambda self, other: accumulated_audio | ||
|
|
||
| MockAudioSegment.from_wav.return_value = mock_full_audio | ||
| MockAudioSegment.empty.return_value = mock_empty |
There was a problem hiding this comment.
This test has a lot of complex mocking setup for pydub.AudioSegment that is very similar to the setup in test_extract_speaker_embeddings_successful. To reduce duplication and improve maintainability of these tests, consider extracting this common mocking logic into a pytest fixture. The fixture could return the configured MockAudioSegment class or a tuple of configured mocks.
…maintainability Based on code review from gemini-code-assist bot, this commit makes the following improvements: ## Type Hints Enhancement 1. **Added Union to typing imports** - Enables proper type annotation for methods accepting multiple types 2. **_load_audio_for_diarization return type** - Added `-> Union[Dict, str]` return type hint - Clearly indicates the method can return either a dict or string 3. **_perform_diarization parameter type** - Added `diarization_input: Union[Dict, str]` type hint - Makes the dual-mode input explicit in the signature 4. **_extract_speaker_embeddings parameter type** - Added `diarization: Any` type hint - Documents that this accepts pyannote pipeline results ## Exception Handling Improvement 5. **More specific exception catching** - Changed `except Exception` to `except ImportError` for pydub import - Prevents masking of other unexpected errors - Makes error handling intent clearer ## Test Code Quality 6. **Extracted common mock setup to fixture** - Created `mock_audio_segment_setup` pytest fixture - Reduces ~60 lines of duplicated mock configuration - Refactored `test_extract_speaker_embeddings_successful` - Refactored `test_extract_speaker_embeddings_handles_inference_error` - Improves test maintainability and readability ## Benefits - Better static type checking support - Clearer API contracts through type hints - More maintainable and DRY test code - More robust exception handling - Improved code readability All changes maintain backward compatibility.
No description provided.