Skip to content

Conversation

@yujonglee
Copy link
Contributor

No description provided.

@yujonglee yujonglee merged commit 4cfba7f into main Nov 14, 2025
3 of 4 checks passed
@yujonglee yujonglee deleted the final-segmenting-again branch November 14, 2025 11:24
@coderabbitai
Copy link

coderabbitai bot commented Nov 14, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

The segment pipeline is refactored from a fixed, linear composition to a data-driven, stage-based architecture. A new generic SegmentPass<T> type enforces required input/output keys at each stage. The hard-coded pipeline is replaced with a SEGMENT_PIPELINE configuration, and pass implementations are reorganized with explicit dependencies and validation. The merge-segments pass is removed, and identity resolution shifts to a rule-based model.

Changes

Cohort / File(s) Summary
Core pipeline orchestration
apps/desktop/src/utils/segment/index.ts
Introduces SEGMENT_PIPELINE configuration, generic GraphWithKey utilities, and runSegmentPipeline function to iterate through stages with validated input/output keys. Replaces hard-coded pipeline composition with data-driven approach; refactors buildSegments to initialize and process graphs through the pipeline.
Segmentation pass refactor
apps/desktop/src/utils/segment/pass-build-segments.ts
Replaces frame iteration with functional collectSegments reducer and multi-state model (ChannelSegmentsState, SegmentationReducerState). Introduces SegmentKeyUtils for centralized key handling and refactors canExtend logic. Updates public export to SegmentPass<"frames">.
Identity resolution refactor
apps/desktop/src/utils/segment/pass-resolve-speakers.ts
Replaces monolithic resolveSpeakerIdentity with rule-based pipeline (IdentityRule, applyIdentityRules). Introduces SpeakerStateSnapshot for state capture and four rule implementations (explicit assignment, speaker-index lookup, channel lookup, carry-forward). Removes provenance tracking; exports resolveIdentitiesPass: SegmentPass<"words">.
Normalization pass refactor
apps/desktop/src/utils/segment/pass-normalize-words.ts
Internalizes normalizeWords function and introduces toSegmentWord helper. Maintains public normalizeWordsPass export with unchanged behavior; refactors implementation to use new private helpers.
Identity propagation refactor
apps/desktop/src/utils/segment/pass-propagate-identity.ts
Replaces procedural propagateCompleteChannelIdentities function with exported identityPropagationPass: SegmentPass<"segments">. Introduces postProcessSegments helper and assignCompleteChannelHumanId for per-segment processing with in-place compaction.
Shared types and utilities
apps/desktop/src/utils/segment/shared.ts
Adds SegmentKey utility methods (hasSpeakerIdentity, equals, serialize). Refactors SegmentPass to generic form with required-key enforcement via RequireKeys helper. Removes "merge_segments" from StageId; removes IdentityProvenance and SpeakerIdentityResolution exports.
Pass removal
apps/desktop/src/utils/segment/pass-merge-segments.ts
Entire file removed; mergeSegmentsPass and associated helpers (hasSpeakerIdentity, sameKey, canMergeSegments, mergeAdjacentSegments) eliminated from pipeline.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant buildSegments
    participant runSegmentPipeline
    participant Stage as Stage Pipeline<br/>(SEGMENT_PIPELINE)
    participant SharedTypes as Graph & Keys
    
    Caller->>buildSegments: call buildSegments(ctx)
    buildSegments->>buildSegments: create initialGraph
    buildSegments->>runSegmentPipeline: runSegmentPipeline(initialGraph)
    
    loop For each stage in SEGMENT_PIPELINE
        runSegmentPipeline->>Stage: validate input keys present
        rect rgba(200, 220, 255, 0.3)
            note over Stage: Stage: normalizeWordsPass<br/>Input: needs "frames"<br/>Output: ensures "words"
        end
        Stage->>SharedTypes: run pass.run(graph, ctx)
        SharedTypes-->>Stage: validate output key
        Stage-->>runSegmentPipeline: graph with new key
        
        rect rgba(200, 220, 255, 0.3)
            note over Stage: Stage: resolveIdentitiesPass<br/>Input: needs "words"<br/>Output: ensures "frames"
        end
        
        rect rgba(200, 220, 255, 0.3)
            note over Stage: Stage: segmentationPass<br/>Input: needs "frames"<br/>Output: ensures "segments"
        end
        
        rect rgba(200, 220, 255, 0.3)
            note over Stage: Stage: identityPropagationPass<br/>Input: needs "segments"<br/>Output: ensures final segments
        end
    end
    
    runSegmentPipeline-->>buildSegments: completed graph
    buildSegments->>buildSegments: ensureGraphHasKeys("segments")
    buildSegments->>buildSegments: finalizeSegments(graph)
    buildSegments-->>Caller: final segments result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • High-density logic changes: Multiple files contain significant refactoring beyond simple reorganization, particularly the rule-based identity resolution system and the new generic SegmentPass type system with key validation.
  • Heterogeneous changes across files: Each pass implementation follows a distinct refactoring pattern (functional reducers in build-segments, rule pipeline in resolve-speakers, helper extraction in normalize-words, wrapper conversion in propagate-identity), requiring separate reasoning for each.
  • Structural shift: Pipeline moves from procedural composition (defaultSegmentPasses) to data-driven configuration (SEGMENT_PIPELINE), affecting control flow understanding across the entire module.
  • Type system changes: New generic constraints and validation utilities (GraphWithKey, RequireKeys, ensureGraphHasKeys) require careful review for correctness of type enforcement.

Areas requiring extra attention:

  • Validation logic in runSegmentPipeline and ensureGraphHasKeys to confirm all key dependencies are correctly enforced between stages
  • Functional reducer logic in pass-build-segments.ts (collectSegments, reduceFrame) to verify segment extension conditions and state transitions
  • Rule execution order and identity resolution fallback chain in pass-resolve-speakers.ts to ensure correct precedence across applyIdentityRules
  • In-place graph mutation and compaction in pass-propagate-identity.ts and pass-build-segments.ts to confirm no data loss during transformations
  • Verification that removal of pass-merge-segments.ts and elimination of the merge stage does not break downstream consumers expecting merged segments

Possibly related PRs

  • Transcript viewer final #1603: Modifies transcript viewer to use segmentation utilities (buildSegments, groupIntoTurns, splitIntoSegments) that are refactored in this PR's core pipeline.
  • Fix health-checks in the setting #1663: Refactors segment pipeline types and pass implementations (normalizeWords, resolveIdentities, propagateIdentity, mergeSegments), directly overlapping with this PR's structural reorganization of the segment module.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch final-segmenting-again

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 150b56a and 6eff061.

📒 Files selected for processing (7)
  • apps/desktop/src/utils/segment/index.ts (3 hunks)
  • apps/desktop/src/utils/segment/pass-build-segments.ts (3 hunks)
  • apps/desktop/src/utils/segment/pass-merge-segments.ts (0 hunks)
  • apps/desktop/src/utils/segment/pass-normalize-words.ts (1 hunks)
  • apps/desktop/src/utils/segment/pass-propagate-identity.ts (1 hunks)
  • apps/desktop/src/utils/segment/pass-resolve-speakers.ts (2 hunks)
  • apps/desktop/src/utils/segment/shared.ts (3 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants