Skip to content

Claude/filter sessions by campaign 011 c us q5 a zug27rakz a pmj tv#7

Merged
Gambitnl merged 4 commits intomainfrom
claude/filter-sessions-by-campaign-011CUsQ5AZug27rakzAPmjTV
Nov 6, 2025
Merged

Claude/filter sessions by campaign 011 c us q5 a zug27rakz a pmj tv#7
Gambitnl merged 4 commits intomainfrom
claude/filter-sessions-by-campaign-011CUsQ5AZug27rakzAPmjTV

Conversation

@Gambitnl
Copy link
Copy Markdown
Owner

@Gambitnl Gambitnl commented Nov 6, 2025

No description provided.

- Clean up stale P3-FEATURE-001 lock (no agent/timestamp)
- Add conflict risk ratings (LOW/MEDIUM/HIGH) for all UI bugs
- Add effort estimates for better task planning
- Group bugs by file/module for easier conflict detection
- Add Multi-Session Coordination Guide with examples
- Reorganize high/medium/low priority bugs by conflict risk

This enables multiple agents to work in parallel on same branch by:
- Clearly identifying which files each bug touches
- Rating conflict risk for safe task selection
- Providing coordination workflow and examples
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @Gambitnl, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a critical enhancement to the campaign dashboard by enabling the filtering of processed sessions based on their associated campaign ID. This ensures that users only see sessions relevant to the currently active campaign, significantly improving data accuracy and user experience. Concurrently, the project's task management documentation has been thoroughly revised to provide clearer guidance for multi-session development and detailed breakdowns of outstanding issues, reflecting the completion of the session filtering task.

Highlights

  • Campaign Session Filtering: Implemented a new feature that filters processed sessions displayed in the campaign dashboard, ensuring only sessions belonging to the currently selected campaign are shown. This resolves BUG-20251103-017.
  • Task Management Documentation Update: The OUTSTANDING_TASKS.md document has been significantly updated to include a comprehensive 'Multi-Session Coordination Guide' and a more detailed, categorized list of bugs with estimated effort and conflict risk levels.
  • Task Status Updates: The P3-FEATURE-001: Real-time Processing task has been marked as available, and BUG-20251103-017 (sessions not filtered by campaign) has been marked as completed by 'Claude'.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the task tracking documentation and implements a bug fix to filter processed sessions by campaign on the campaign dashboard. The code change correctly filters sessions by reading metadata from _data.json files. My review includes one suggestion to improve robustness by handling cases where the JSON metadata file does not contain a dictionary at its root, which could otherwise lead to an unhandled exception.

Comment thread src/campaign_dashboard.py Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/campaign_dashboard.py
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@Gambitnl Gambitnl merged commit 3b680e9 into main Nov 6, 2025
@Gambitnl Gambitnl deleted the claude/filter-sessions-by-campaign-011CUsQ5AZug27rakzAPmjTV branch November 6, 2025 22:51
Gambitnl pushed a commit that referenced this pull request Nov 12, 2025
- Add Classification enum (IC, OOC, MIXED) with display names
- Add ProcessingStatus enum with terminal status checks
- Add PipelineStage enum with stage numbers and display names
- Add OutputFormat enum with file extension helpers
- Add SpeakerLabel class for speaker ID constants
- Add ConfidenceDefaults class with value clamping
- Add TimeConstants class with time conversion utilities
- Add comprehensive test suite with 100% coverage (38 tests)
- All enums inherit from str for backward compatibility

Part of Refactor #7: Replace Magic Strings with Enums
Gambitnl pushed a commit that referenced this pull request Nov 12, 2025
Classifier updates (classifier.py):
- Update ClassificationResult to use Classification enum
- Update to_dict/from_dict for enum serialization
- Replace magic strings with Classification enum in both classifiers
- Replace confidence defaults with ConfidenceDefaults.DEFAULT
- Replace confidence clamping with ConfidenceDefaults.clamp()
- Add error handling for invalid classification strings
- Update tests to use Classification enum

Formatter updates (formatter.py):
- Use Classification enum for all IC/OOC comparisons
- Use OutputFormat enum for return dictionary keys in save_all_formats()
- Serialize classification.value in JSON output
- Update StatisticsGenerator to use Classification enum

All existing tests pass with backward compatibility maintained

Part of Refactor #7: Replace Magic Strings with Enums
Gambitnl pushed a commit that referenced this pull request Nov 12, 2025
- Add PipelineStage and ProcessingStatus imports
- Replace all 71 occurrences of stage strings with PipelineStage enum:
  - "audio_converted" → PipelineStage.AUDIO_CONVERTED
  - "audio_chunked" → PipelineStage.AUDIO_CHUNKED
  - "audio_transcribed" → PipelineStage.AUDIO_TRANSCRIBED
  - "transcription_merged" → PipelineStage.TRANSCRIPTION_MERGED
  - "speaker_diarized" → PipelineStage.SPEAKER_DIARIZED
  - "segments_classified" → PipelineStage.SEGMENTS_CLASSIFIED
  - "outputs_generated" → PipelineStage.OUTPUTS_GENERATED
  - "audio_segments_exported" → PipelineStage.AUDIO_SEGMENTS_EXPORTED
  - "knowledge_extracted" → PipelineStage.KNOWLEDGE_EXTRACTED

- Backward compatibility maintained (enums inherit from str)
- Checkpoint system works without changes (accepts str type)

Part of Refactor #7: Replace Magic Strings with Enums
Gambitnl pushed a commit that referenced this pull request Nov 12, 2025
- Add SpeakerLabel import
- Replace UNKNOWN string with SpeakerLabel.UNKNOWN
- Improves type safety and consistency

Part of Refactor #7: Replace Magic Strings with Enums
Gambitnl pushed a commit that referenced this pull request Nov 12, 2025
- Document all enum usage patterns
- Provide before/after examples
- Explain backward compatibility
- List all updated files
- Include troubleshooting section

Part of Refactor #7: Replace Magic Strings with Enums
Gambitnl pushed a commit that referenced this pull request Nov 12, 2025
- Add Classification and TranscriptFilter enums to src/constants.py
- Implement format_filtered() method with flexible filtering
- Update format_ic_only/ooc_only to delegate to new method
- Maintain 100% backward compatibility with existing code

Changes:
- src/constants.py: New file with Classification and TranscriptFilter enums
- src/formatter.py: Added format_filtered() method (80 lines)
- src/formatter.py: Refactored format_ic_only() to delegate (11 lines, was 36)
- src/formatter.py: Refactored format_ooc_only() to delegate (11 lines, was 33)

Benefits:
- Eliminates ~58 lines of duplicate code
- Strategy pattern allows easy extension (e.g., MIXED_ONLY filter)
- Type-safe filtering with enums instead of magic strings
- No breaking changes - all existing code continues to work

Part of: Refactor #5 - Consolidate Formatter Methods
Related: Refactor #7 - Replace Magic Strings with Enums
Gambitnl added a commit that referenced this pull request Nov 12, 2025
* feat: Create constants module with type-safe enums

- Add Classification enum (IC, OOC, MIXED) with display names
- Add ProcessingStatus enum with terminal status checks
- Add PipelineStage enum with stage numbers and display names
- Add OutputFormat enum with file extension helpers
- Add SpeakerLabel class for speaker ID constants
- Add ConfidenceDefaults class with value clamping
- Add TimeConstants class with time conversion utilities
- Add comprehensive test suite with 100% coverage (38 tests)
- All enums inherit from str for backward compatibility

Part of Refactor #7: Replace Magic Strings with Enums

* refactor: Update classifier and formatter to use enums

Classifier updates (classifier.py):
- Update ClassificationResult to use Classification enum
- Update to_dict/from_dict for enum serialization
- Replace magic strings with Classification enum in both classifiers
- Replace confidence defaults with ConfidenceDefaults.DEFAULT
- Replace confidence clamping with ConfidenceDefaults.clamp()
- Add error handling for invalid classification strings
- Update tests to use Classification enum

Formatter updates (formatter.py):
- Use Classification enum for all IC/OOC comparisons
- Use OutputFormat enum for return dictionary keys in save_all_formats()
- Serialize classification.value in JSON output
- Update StatisticsGenerator to use Classification enum

All existing tests pass with backward compatibility maintained

Part of Refactor #7: Replace Magic Strings with Enums

* refactor: Update pipeline.py to use PipelineStage enum

- Add PipelineStage and ProcessingStatus imports
- Replace all 71 occurrences of stage strings with PipelineStage enum:
  - "audio_converted" → PipelineStage.AUDIO_CONVERTED
  - "audio_chunked" → PipelineStage.AUDIO_CHUNKED
  - "audio_transcribed" → PipelineStage.AUDIO_TRANSCRIBED
  - "transcription_merged" → PipelineStage.TRANSCRIPTION_MERGED
  - "speaker_diarized" → PipelineStage.SPEAKER_DIARIZED
  - "segments_classified" → PipelineStage.SEGMENTS_CLASSIFIED
  - "outputs_generated" → PipelineStage.OUTPUTS_GENERATED
  - "audio_segments_exported" → PipelineStage.AUDIO_SEGMENTS_EXPORTED
  - "knowledge_extracted" → PipelineStage.KNOWLEDGE_EXTRACTED

- Backward compatibility maintained (enums inherit from str)
- Checkpoint system works without changes (accepts str type)

Part of Refactor #7: Replace Magic Strings with Enums

* refactor: Update diarizer.py to use SpeakerLabel constant

- Add SpeakerLabel import
- Replace UNKNOWN string with SpeakerLabel.UNKNOWN
- Improves type safety and consistency

Part of Refactor #7: Replace Magic Strings with Enums

* docs: Add comprehensive migration guide for enum refactoring

- Document all enum usage patterns
- Provide before/after examples
- Explain backward compatibility
- List all updated files
- Include troubleshooting section

Part of Refactor #7: Replace Magic Strings with Enums

* fix: Critical bug fixes and ProcessingStatus enum usage

Critical P1 Fixes:
- Fix classification fallback crash: Use Classification.IN_CHARACTER enum
  instead of string "IC" in error/skip paths (would crash on to_dict())
- Use ConfidenceDefaults.DEFAULT instead of hardcoded 0.5

High Priority:
- Add ProcessingStatus enum to 24 StatusTracker.update_stage() calls
- Add Classification and ConfidenceDefaults imports to pipeline.py

Code Quality:
- Simplify unnecessary .value calls in formatter.py
  (enums inherit from str, so .value is optional in most contexts)

Note: Kept __str__ methods in enums - they ARE necessary to ensure
str(enum) returns the value (e.g., 'IC') not the name (e.g.,
'Classification.IN_CHARACTER') for proper logging and serialization.

Addresses code review feedback from gemini-code-assist and chatgpt-codex

---------

Co-authored-by: Claude <noreply@anthropic.com>
Gambitnl pushed a commit that referenced this pull request Nov 12, 2025
Resolved conflicts by integrating changes from Refactor #7 (Magic Strings to Enums):
- src/constants.py: Added TranscriptFilter enum to comprehensive constants from main
- src/formatter.py: Kept format_filtered() refactoring while adopting Classification and OutputFormat enums from main

Changes integrated from main:
- Classification enum with display_name property
- ProcessingStatus, PipelineStage, OutputFormat enums
- SpeakerLabel, ConfidenceDefaults, TimeConstants classes
- Updated formatter to use Classification enum throughout
- Rate limiting for Groq API
- Diarization improvements

Maintained from this branch:
- format_filtered() method with Strategy pattern
- TranscriptFilter enum for filtering options
- Backward-compatible format_ic_only/ooc_only delegation
- Comprehensive test suite for format_filtered()

No conflicts remain. All tests should pass.
Gambitnl added a commit that referenced this pull request Nov 12, 2025
* feat(formatter): add format_filtered with Strategy pattern

- Add Classification and TranscriptFilter enums to src/constants.py
- Implement format_filtered() method with flexible filtering
- Update format_ic_only/ooc_only to delegate to new method
- Maintain 100% backward compatibility with existing code

Changes:
- src/constants.py: New file with Classification and TranscriptFilter enums
- src/formatter.py: Added format_filtered() method (80 lines)
- src/formatter.py: Refactored format_ic_only() to delegate (11 lines, was 36)
- src/formatter.py: Refactored format_ooc_only() to delegate (11 lines, was 33)

Benefits:
- Eliminates ~58 lines of duplicate code
- Strategy pattern allows easy extension (e.g., MIXED_ONLY filter)
- Type-safe filtering with enums instead of magic strings
- No breaking changes - all existing code continues to work

Part of: Refactor #5 - Consolidate Formatter Methods
Related: Refactor #7 - Replace Magic Strings with Enums

* test(formatter): add comprehensive tests for format_filtered

Add 14 comprehensive test cases covering:
- All filter types (ALL, IC_ONLY, OOC_ONLY, MIXED_ONLY)
- Edge cases (empty transcript, invalid filter type)
- Speaker profile mapping and character name usage
- Backward compatibility with format_ic_only/ooc_only
- Timestamp formatting
- Default parameter behavior

Tests ensure:
- New method works correctly for all scenarios
- Old methods produce identical output to new method
- Type safety with enum validation
- Proper error handling

Coverage: 100% of format_filtered() logic paths

Part of: Refactor #5 - Consolidate Formatter Methods

* fix(formatter): restore backward compatibility for MIXED segments

Critical fix addressing code review feedback from gemini-code-assist
and chatgpt-codex-connector bots.

Problem:
The original format_ic_only() used `if classification == "OOC": continue`,
which included BOTH IC and MIXED segments (excluded only OOC).
The new format_filtered() used `if classification != IC: continue`, which
EXCLUDED both OOC and MIXED segments, breaking backward compatibility.

Solution:
- Changed IC_ONLY filter to exclude only OOC (keeps IC and MIXED)
- Changed OOC_ONLY filter to exclude only IC (keeps OOC and MIXED)
- MIXED_ONLY filter continues to include only MIXED segments
- Added detailed comments explaining the filtering logic

Additional fixes:
- Changed TranscriptFilter.ALL header from "FULL VERSION" to "FILTERED VERSION"
  to avoid confusion with format_full_transcript() output
- Updated test fixtures to include MIXED classification
- Updated test expectations to verify MIXED segments appear in both IC and OOC outputs
- Added backward compatibility verification comments

This ensures existing code that calls format_ic_only()/format_ooc_only()
continues to work exactly as before, with MIXED content appearing in both
generated transcripts.

Fixes: Code review issues P1 (breaking change) and High (header confusion)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Gambitnl pushed a commit that referenced this pull request Nov 12, 2025
Consolidated duplicate _parse_response() and _build_prompt() methods
from OllamaClassifier and GroqClassifier into the BaseClassifier,
eliminating ~65 lines of duplicate code.

Changes:
- Moved _build_prompt() to BaseClassifier with comprehensive docstring
  - Handles prompt template formatting with character/player context
  - Subclasses can override for custom prompt building

- Moved _parse_response() to BaseClassifier with comprehensive docstring
  - Parses Dutch-format LLM responses (Classificatie/Reden/Vertrouwen/Personage)
  - Handles Classification enum, confidence clamping, and error cases
  - Subclasses can override for different response formats

- Added preflight_check() to GroqClassifier
  - Validates API key configuration
  - Tests API connectivity
  - Matches OllamaClassifier pattern

Benefits:
- DRY principle: Single implementation instead of duplicate code
- Maintainability: Bug fixes and improvements apply to all classifiers
- Extensibility: New classifiers can inherit default implementations
- Type safety: Uses Classification enum from constants module

Testing:
- All 37 existing tests pass
- Code coverage: 78%
- No breaking changes - backward compatible

Dependencies:
- Uses Classification enum from src/constants.py (Refactor #7)

Refs: Refactor #2+#3 - Consolidate Classifier Methods
Gambitnl added a commit that referenced this pull request Nov 12, 2025
* refactor(classifier): consolidate duplicate methods in BaseClassifier

Consolidated duplicate _parse_response() and _build_prompt() methods
from OllamaClassifier and GroqClassifier into the BaseClassifier,
eliminating ~65 lines of duplicate code.

Changes:
- Moved _build_prompt() to BaseClassifier with comprehensive docstring
  - Handles prompt template formatting with character/player context
  - Subclasses can override for custom prompt building

- Moved _parse_response() to BaseClassifier with comprehensive docstring
  - Parses Dutch-format LLM responses (Classificatie/Reden/Vertrouwen/Personage)
  - Handles Classification enum, confidence clamping, and error cases
  - Subclasses can override for different response formats

- Added preflight_check() to GroqClassifier
  - Validates API key configuration
  - Tests API connectivity
  - Matches OllamaClassifier pattern

Benefits:
- DRY principle: Single implementation instead of duplicate code
- Maintainability: Bug fixes and improvements apply to all classifiers
- Extensibility: New classifiers can inherit default implementations
- Type safety: Uses Classification enum from constants module

Testing:
- All 37 existing tests pass
- Code coverage: 78%
- No breaking changes - backward compatible

Dependencies:
- Uses Classification enum from src/constants.py (Refactor #7)

Refs: Refactor #2+#3 - Consolidate Classifier Methods

* fix(classifier): improve response parsing robustness and logging

Addressed code review feedback from gemini-code-assist:

1. Fixed high-severity multi-line parsing issue:
   - Replaced brittle line-by-line parsing with regex-based extraction
   - Used re.DOTALL flag for reasoning field to handle multi-line values
   - Prevents corruption when field keywords appear in multi-line text
   - Handles out-of-order fields gracefully

2. Fixed medium-severity silent error handling:
   - Added warning log for invalid confidence values
   - Now consistent with classification error handling
   - Improves debuggability by surfacing parsing issues

Changes:
- _parse_response() now uses regex patterns for each field
- Reden (reasoning) captures everything until next field
- More resilient to LLM output variations
- Better error visibility for debugging

Testing:
- All 37 tests still passing
- Backward compatible with existing test cases

Co-authored-by: gemini-code-assist[bot]

---------

Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants