Conversation
- Clean up stale P3-FEATURE-001 lock (no agent/timestamp) - Add conflict risk ratings (LOW/MEDIUM/HIGH) for all UI bugs - Add effort estimates for better task planning - Group bugs by file/module for easier conflict detection - Add Multi-Session Coordination Guide with examples - Reorganize high/medium/low priority bugs by conflict risk This enables multiple agents to work in parallel on same branch by: - Clearly identifying which files each bug touches - Rating conflict risk for safe task selection - Providing coordination workflow and examples
Summary of ChangesHello @Gambitnl, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a critical enhancement to the campaign dashboard by enabling the filtering of processed sessions based on their associated campaign ID. This ensures that users only see sessions relevant to the currently active campaign, significantly improving data accuracy and user experience. Concurrently, the project's task management documentation has been thoroughly revised to provide clearer guidance for multi-session development and detailed breakdowns of outstanding issues, reflecting the completion of the session filtering task. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request updates the task tracking documentation and implements a bug fix to filter processed sessions by campaign on the campaign dashboard. The code change correctly filters sessions by reading metadata from _data.json files. My review includes one suggestion to improve robustness by handling cases where the JSON metadata file does not contain a dictionary at its root, which could otherwise lead to an unhandled exception.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
- Add Classification enum (IC, OOC, MIXED) with display names - Add ProcessingStatus enum with terminal status checks - Add PipelineStage enum with stage numbers and display names - Add OutputFormat enum with file extension helpers - Add SpeakerLabel class for speaker ID constants - Add ConfidenceDefaults class with value clamping - Add TimeConstants class with time conversion utilities - Add comprehensive test suite with 100% coverage (38 tests) - All enums inherit from str for backward compatibility Part of Refactor #7: Replace Magic Strings with Enums
Classifier updates (classifier.py): - Update ClassificationResult to use Classification enum - Update to_dict/from_dict for enum serialization - Replace magic strings with Classification enum in both classifiers - Replace confidence defaults with ConfidenceDefaults.DEFAULT - Replace confidence clamping with ConfidenceDefaults.clamp() - Add error handling for invalid classification strings - Update tests to use Classification enum Formatter updates (formatter.py): - Use Classification enum for all IC/OOC comparisons - Use OutputFormat enum for return dictionary keys in save_all_formats() - Serialize classification.value in JSON output - Update StatisticsGenerator to use Classification enum All existing tests pass with backward compatibility maintained Part of Refactor #7: Replace Magic Strings with Enums
- Add PipelineStage and ProcessingStatus imports - Replace all 71 occurrences of stage strings with PipelineStage enum: - "audio_converted" → PipelineStage.AUDIO_CONVERTED - "audio_chunked" → PipelineStage.AUDIO_CHUNKED - "audio_transcribed" → PipelineStage.AUDIO_TRANSCRIBED - "transcription_merged" → PipelineStage.TRANSCRIPTION_MERGED - "speaker_diarized" → PipelineStage.SPEAKER_DIARIZED - "segments_classified" → PipelineStage.SEGMENTS_CLASSIFIED - "outputs_generated" → PipelineStage.OUTPUTS_GENERATED - "audio_segments_exported" → PipelineStage.AUDIO_SEGMENTS_EXPORTED - "knowledge_extracted" → PipelineStage.KNOWLEDGE_EXTRACTED - Backward compatibility maintained (enums inherit from str) - Checkpoint system works without changes (accepts str type) Part of Refactor #7: Replace Magic Strings with Enums
- Add SpeakerLabel import - Replace UNKNOWN string with SpeakerLabel.UNKNOWN - Improves type safety and consistency Part of Refactor #7: Replace Magic Strings with Enums
- Document all enum usage patterns - Provide before/after examples - Explain backward compatibility - List all updated files - Include troubleshooting section Part of Refactor #7: Replace Magic Strings with Enums
- Add Classification and TranscriptFilter enums to src/constants.py - Implement format_filtered() method with flexible filtering - Update format_ic_only/ooc_only to delegate to new method - Maintain 100% backward compatibility with existing code Changes: - src/constants.py: New file with Classification and TranscriptFilter enums - src/formatter.py: Added format_filtered() method (80 lines) - src/formatter.py: Refactored format_ic_only() to delegate (11 lines, was 36) - src/formatter.py: Refactored format_ooc_only() to delegate (11 lines, was 33) Benefits: - Eliminates ~58 lines of duplicate code - Strategy pattern allows easy extension (e.g., MIXED_ONLY filter) - Type-safe filtering with enums instead of magic strings - No breaking changes - all existing code continues to work Part of: Refactor #5 - Consolidate Formatter Methods Related: Refactor #7 - Replace Magic Strings with Enums
* feat: Create constants module with type-safe enums - Add Classification enum (IC, OOC, MIXED) with display names - Add ProcessingStatus enum with terminal status checks - Add PipelineStage enum with stage numbers and display names - Add OutputFormat enum with file extension helpers - Add SpeakerLabel class for speaker ID constants - Add ConfidenceDefaults class with value clamping - Add TimeConstants class with time conversion utilities - Add comprehensive test suite with 100% coverage (38 tests) - All enums inherit from str for backward compatibility Part of Refactor #7: Replace Magic Strings with Enums * refactor: Update classifier and formatter to use enums Classifier updates (classifier.py): - Update ClassificationResult to use Classification enum - Update to_dict/from_dict for enum serialization - Replace magic strings with Classification enum in both classifiers - Replace confidence defaults with ConfidenceDefaults.DEFAULT - Replace confidence clamping with ConfidenceDefaults.clamp() - Add error handling for invalid classification strings - Update tests to use Classification enum Formatter updates (formatter.py): - Use Classification enum for all IC/OOC comparisons - Use OutputFormat enum for return dictionary keys in save_all_formats() - Serialize classification.value in JSON output - Update StatisticsGenerator to use Classification enum All existing tests pass with backward compatibility maintained Part of Refactor #7: Replace Magic Strings with Enums * refactor: Update pipeline.py to use PipelineStage enum - Add PipelineStage and ProcessingStatus imports - Replace all 71 occurrences of stage strings with PipelineStage enum: - "audio_converted" → PipelineStage.AUDIO_CONVERTED - "audio_chunked" → PipelineStage.AUDIO_CHUNKED - "audio_transcribed" → PipelineStage.AUDIO_TRANSCRIBED - "transcription_merged" → PipelineStage.TRANSCRIPTION_MERGED - "speaker_diarized" → PipelineStage.SPEAKER_DIARIZED - "segments_classified" → PipelineStage.SEGMENTS_CLASSIFIED - "outputs_generated" → PipelineStage.OUTPUTS_GENERATED - "audio_segments_exported" → PipelineStage.AUDIO_SEGMENTS_EXPORTED - "knowledge_extracted" → PipelineStage.KNOWLEDGE_EXTRACTED - Backward compatibility maintained (enums inherit from str) - Checkpoint system works without changes (accepts str type) Part of Refactor #7: Replace Magic Strings with Enums * refactor: Update diarizer.py to use SpeakerLabel constant - Add SpeakerLabel import - Replace UNKNOWN string with SpeakerLabel.UNKNOWN - Improves type safety and consistency Part of Refactor #7: Replace Magic Strings with Enums * docs: Add comprehensive migration guide for enum refactoring - Document all enum usage patterns - Provide before/after examples - Explain backward compatibility - List all updated files - Include troubleshooting section Part of Refactor #7: Replace Magic Strings with Enums * fix: Critical bug fixes and ProcessingStatus enum usage Critical P1 Fixes: - Fix classification fallback crash: Use Classification.IN_CHARACTER enum instead of string "IC" in error/skip paths (would crash on to_dict()) - Use ConfidenceDefaults.DEFAULT instead of hardcoded 0.5 High Priority: - Add ProcessingStatus enum to 24 StatusTracker.update_stage() calls - Add Classification and ConfidenceDefaults imports to pipeline.py Code Quality: - Simplify unnecessary .value calls in formatter.py (enums inherit from str, so .value is optional in most contexts) Note: Kept __str__ methods in enums - they ARE necessary to ensure str(enum) returns the value (e.g., 'IC') not the name (e.g., 'Classification.IN_CHARACTER') for proper logging and serialization. Addresses code review feedback from gemini-code-assist and chatgpt-codex --------- Co-authored-by: Claude <noreply@anthropic.com>
Resolved conflicts by integrating changes from Refactor #7 (Magic Strings to Enums): - src/constants.py: Added TranscriptFilter enum to comprehensive constants from main - src/formatter.py: Kept format_filtered() refactoring while adopting Classification and OutputFormat enums from main Changes integrated from main: - Classification enum with display_name property - ProcessingStatus, PipelineStage, OutputFormat enums - SpeakerLabel, ConfidenceDefaults, TimeConstants classes - Updated formatter to use Classification enum throughout - Rate limiting for Groq API - Diarization improvements Maintained from this branch: - format_filtered() method with Strategy pattern - TranscriptFilter enum for filtering options - Backward-compatible format_ic_only/ooc_only delegation - Comprehensive test suite for format_filtered() No conflicts remain. All tests should pass.
* feat(formatter): add format_filtered with Strategy pattern - Add Classification and TranscriptFilter enums to src/constants.py - Implement format_filtered() method with flexible filtering - Update format_ic_only/ooc_only to delegate to new method - Maintain 100% backward compatibility with existing code Changes: - src/constants.py: New file with Classification and TranscriptFilter enums - src/formatter.py: Added format_filtered() method (80 lines) - src/formatter.py: Refactored format_ic_only() to delegate (11 lines, was 36) - src/formatter.py: Refactored format_ooc_only() to delegate (11 lines, was 33) Benefits: - Eliminates ~58 lines of duplicate code - Strategy pattern allows easy extension (e.g., MIXED_ONLY filter) - Type-safe filtering with enums instead of magic strings - No breaking changes - all existing code continues to work Part of: Refactor #5 - Consolidate Formatter Methods Related: Refactor #7 - Replace Magic Strings with Enums * test(formatter): add comprehensive tests for format_filtered Add 14 comprehensive test cases covering: - All filter types (ALL, IC_ONLY, OOC_ONLY, MIXED_ONLY) - Edge cases (empty transcript, invalid filter type) - Speaker profile mapping and character name usage - Backward compatibility with format_ic_only/ooc_only - Timestamp formatting - Default parameter behavior Tests ensure: - New method works correctly for all scenarios - Old methods produce identical output to new method - Type safety with enum validation - Proper error handling Coverage: 100% of format_filtered() logic paths Part of: Refactor #5 - Consolidate Formatter Methods * fix(formatter): restore backward compatibility for MIXED segments Critical fix addressing code review feedback from gemini-code-assist and chatgpt-codex-connector bots. Problem: The original format_ic_only() used `if classification == "OOC": continue`, which included BOTH IC and MIXED segments (excluded only OOC). The new format_filtered() used `if classification != IC: continue`, which EXCLUDED both OOC and MIXED segments, breaking backward compatibility. Solution: - Changed IC_ONLY filter to exclude only OOC (keeps IC and MIXED) - Changed OOC_ONLY filter to exclude only IC (keeps OOC and MIXED) - MIXED_ONLY filter continues to include only MIXED segments - Added detailed comments explaining the filtering logic Additional fixes: - Changed TranscriptFilter.ALL header from "FULL VERSION" to "FILTERED VERSION" to avoid confusion with format_full_transcript() output - Updated test fixtures to include MIXED classification - Updated test expectations to verify MIXED segments appear in both IC and OOC outputs - Added backward compatibility verification comments This ensures existing code that calls format_ic_only()/format_ooc_only() continues to work exactly as before, with MIXED content appearing in both generated transcripts. Fixes: Code review issues P1 (breaking change) and High (header confusion) --------- Co-authored-by: Claude <noreply@anthropic.com>
Consolidated duplicate _parse_response() and _build_prompt() methods from OllamaClassifier and GroqClassifier into the BaseClassifier, eliminating ~65 lines of duplicate code. Changes: - Moved _build_prompt() to BaseClassifier with comprehensive docstring - Handles prompt template formatting with character/player context - Subclasses can override for custom prompt building - Moved _parse_response() to BaseClassifier with comprehensive docstring - Parses Dutch-format LLM responses (Classificatie/Reden/Vertrouwen/Personage) - Handles Classification enum, confidence clamping, and error cases - Subclasses can override for different response formats - Added preflight_check() to GroqClassifier - Validates API key configuration - Tests API connectivity - Matches OllamaClassifier pattern Benefits: - DRY principle: Single implementation instead of duplicate code - Maintainability: Bug fixes and improvements apply to all classifiers - Extensibility: New classifiers can inherit default implementations - Type safety: Uses Classification enum from constants module Testing: - All 37 existing tests pass - Code coverage: 78% - No breaking changes - backward compatible Dependencies: - Uses Classification enum from src/constants.py (Refactor #7) Refs: Refactor #2+#3 - Consolidate Classifier Methods
* refactor(classifier): consolidate duplicate methods in BaseClassifier Consolidated duplicate _parse_response() and _build_prompt() methods from OllamaClassifier and GroqClassifier into the BaseClassifier, eliminating ~65 lines of duplicate code. Changes: - Moved _build_prompt() to BaseClassifier with comprehensive docstring - Handles prompt template formatting with character/player context - Subclasses can override for custom prompt building - Moved _parse_response() to BaseClassifier with comprehensive docstring - Parses Dutch-format LLM responses (Classificatie/Reden/Vertrouwen/Personage) - Handles Classification enum, confidence clamping, and error cases - Subclasses can override for different response formats - Added preflight_check() to GroqClassifier - Validates API key configuration - Tests API connectivity - Matches OllamaClassifier pattern Benefits: - DRY principle: Single implementation instead of duplicate code - Maintainability: Bug fixes and improvements apply to all classifiers - Extensibility: New classifiers can inherit default implementations - Type safety: Uses Classification enum from constants module Testing: - All 37 existing tests pass - Code coverage: 78% - No breaking changes - backward compatible Dependencies: - Uses Classification enum from src/constants.py (Refactor #7) Refs: Refactor #2+#3 - Consolidate Classifier Methods * fix(classifier): improve response parsing robustness and logging Addressed code review feedback from gemini-code-assist: 1. Fixed high-severity multi-line parsing issue: - Replaced brittle line-by-line parsing with regex-based extraction - Used re.DOTALL flag for reasoning field to handle multi-line values - Prevents corruption when field keywords appear in multi-line text - Handles out-of-order fields gracefully 2. Fixed medium-severity silent error handling: - Added warning log for invalid confidence values - Now consistent with classification error handling - Improves debuggability by surfacing parsing issues Changes: - _parse_response() now uses regex patterns for each field - Reden (reasoning) captures everything until next field - More resilient to LLM output variations - Better error visibility for debugging Testing: - All 37 tests still passing - Backward compatible with existing test cases Co-authored-by: gemini-code-assist[bot] --------- Co-authored-by: Claude <noreply@anthropic.com>
No description provided.