[Refactor] Extract diff parsing into src/diff module #66

@oleander

Context

src/hook.rs is 725 lines long and mixes multiple concerns. As the first step in breaking it apart, this issue moves its utility traits, together with the diff parsing logic in src/multi_step_integration.rs, into a dedicated src/diff module.

Priority

🟡 HIGH - Reduces large file, improves organization

Steps

1. Create module structure

mkdir -p src/diff
touch src/diff/mod.rs
touch src/diff/parser.rs
touch src/diff/traits.rs

2. Create src/diff/traits.rs

Extract utility traits from hook.rs:

//! Utility traits for diff processing.

use std::path::PathBuf;
use anyhow::Result;

/// Extension trait for PathBuf to support file operations needed for commits
pub trait FilePath {
    fn is_empty(&self) -> Result<bool>;
    fn write(&self, msg: String) -> Result<()>;
    fn read(&self) -> Result<String>;
}

// Move implementations from hook.rs lines 88-109

/// Extension trait for converting bytes to UTF-8 strings
pub trait Utf8String {
    fn to_utf8(&self) -> String;
}

// Move implementations from hook.rs lines 128-152

/// Extension trait for git2::DiffDelta to get file paths
pub trait DiffDeltaPath {
    fn path(&self) -> PathBuf;
}

// Move implementations from hook.rs lines 112-125
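
For orientation, the implementations being moved might look roughly like the sketch below. This is not the code in hook.rs: it uses `std::io::Result` as a stand-in for `anyhow::Result` so it compiles without dependencies, it assumes "empty" means "only whitespace", and it assumes invalid UTF-8 is replaced lossily. `DiffDeltaPath` is left out because it needs the git2 crate.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Same shape as the `FilePath` trait above, with `std::io::Result`
/// standing in for `anyhow::Result` (assumption made for this sketch).
pub trait FilePath {
    fn is_empty(&self) -> io::Result<bool>;
    fn write(&self, msg: String) -> io::Result<()>;
    fn read(&self) -> io::Result<String>;
}

impl FilePath for PathBuf {
    // Assumption: a file holding only whitespace counts as empty.
    fn is_empty(&self) -> io::Result<bool> {
        Ok(FilePath::read(self)?.trim().is_empty())
    }

    // Overwrites the file with `msg`, creating it if needed.
    fn write(&self, msg: String) -> io::Result<()> {
        fs::write(self, msg)
    }

    // Reads the whole file as a UTF-8 string.
    fn read(&self) -> io::Result<String> {
        fs::read_to_string(self)
    }
}

/// Same shape as the `Utf8String` trait above.
pub trait Utf8String {
    fn to_utf8(&self) -> String;
}

impl Utf8String for [u8] {
    // Assumption: invalid byte sequences become U+FFFD instead of failing.
    fn to_utf8(&self) -> String {
        String::from_utf8_lossy(self).into_owned()
    }
}

impl Utf8String for Vec<u8> {
    fn to_utf8(&self) -> String {
        self.as_slice().to_utf8()
    }
}
```

Whatever the real bodies look like, the trait signatures should stay exactly as they are in hook.rs so that existing call sites keep compiling after the move.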

3. Create src/diff/parser.rs

Extract diff parsing logic from multi_step_integration.rs:

//! Git diff parsing utilities.

use std::path::PathBuf;
use anyhow::Result;

/// Represents a parsed file from a git diff
#[derive(Debug, Clone)]
pub struct ParsedFile {
    pub path: String,
    pub operation: String,
    pub diff_content: String,
}

/// Parse git diff into individual file changes.
///
/// Handles various diff formats including:
/// - Standard git diff output
/// - Diffs with commit hashes
/// - Diffs with various path prefixes (a/, b/, c/, i/)
/// - Deleted files (/dev/null paths)
///
/// # Arguments
/// * `diff_content` - Raw git diff text
///
/// # Returns
/// * `Result<Vec<ParsedFile>>` - Parsed files or error
pub fn parse_diff(diff_content: &str) -> Result<Vec<ParsedFile>> {
    // Move implementation from multi_step_integration.rs lines 212-402
    todo!("replace with the moved implementation")
}

/// Extracts file path from diff header parts
fn extract_file_path_from_diff_parts(parts: &[&str]) -> Option<String> {
    // Move from multi_step_integration.rs lines 186-209
    todo!("replace with the moved implementation")
}

#[cfg(test)]
mod tests {
    // Move tests from multi_step_integration.rs lines 639-810
}
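
To make the expected behavior concrete, here is a deliberately simplified sketch of what a parser with this shape does. It is not the implementation from multi_step_integration.rs: `parse_diff_sketch` and `strip_side_prefix` are hypothetical names, the function returns a plain `Vec` rather than `Result`, the operation strings ("modified", "added", "deleted", "renamed") are assumptions, and paths containing spaces are not handled.

```rust
/// Mirrors the `ParsedFile` struct above.
#[derive(Debug, Clone, PartialEq)]
pub struct ParsedFile {
    pub path: String,
    pub operation: String,
    pub diff_content: String,
}

/// Hypothetical helper: drops a one-letter prefix such as `a/`, `b/`, `c/` or `i/`.
fn strip_side_prefix(path: &str) -> &str {
    let bytes = path.as_bytes();
    if bytes.len() > 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b'/' {
        &path[2..]
    } else {
        path
    }
}

/// Simplified stand-in for `parse_diff`: splits on `diff --git` headers
/// and records one `ParsedFile` per header.
pub fn parse_diff_sketch(diff_content: &str) -> Vec<ParsedFile> {
    let mut files: Vec<ParsedFile> = Vec::new();

    for line in diff_content.lines() {
        if let Some(rest) = line.strip_prefix("diff --git ") {
            // Header looks like `diff --git a/old b/new`; keep the new path.
            let path = rest
                .split_whitespace()
                .last()
                .map(strip_side_prefix)
                .unwrap_or("unknown");
            files.push(ParsedFile {
                path: path.to_string(),
                operation: "modified".to_string(),
                diff_content: String::new(),
            });
        }

        // Lines before the first header are skipped; all others are
        // appended to the file section they belong to.
        if let Some(file) = files.last_mut() {
            if line.starts_with("new file mode") {
                file.operation = "added".to_string();
            } else if line.starts_with("deleted file mode") {
                file.operation = "deleted".to_string();
            } else if line.starts_with("rename from ") {
                file.operation = "renamed".to_string();
            }
            file.diff_content.push_str(line);
            file.diff_content.push('\n');
        }
    }

    files
}
```

Feeding it a diff with one modified file and one deleted file should yield two entries, with the deleted file's path taken from the `diff --git` header rather than from the `/dev/null` line. The moved tests remain the source of truth for the real edge cases.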

4. Create src/diff/mod.rs

//! Diff processing and parsing utilities.
//!
//! This module handles parsing git diffs into structured data
//! and provides utilities for working with diff content.

pub mod parser;
pub mod traits;

pub use parser::{ParsedFile, parse_diff};
pub use traits::{FilePath, Utf8String, DiffDeltaPath};
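
The effect of the `pub use` lines can be seen in a small self-contained sketch. The inline modules and the placeholder body of `parse_diff` (it just counts lines) are assumptions made so the example compiles on its own; `self::` is spelled out only so the snippet also builds under the 2015 edition.

```rust
mod diff {
    pub mod parser {
        // Placeholder body for illustration only: counts input lines.
        pub fn parse_diff(diff_content: &str) -> usize {
            diff_content.lines().count()
        }
    }

    // Re-export so callers can skip the `parser` segment.
    pub use self::parser::parse_diff;
}

/// Calls the function through the short, re-exported path.
pub fn via_reexport(input: &str) -> usize {
    diff::parse_diff(input)
}

/// Calls the same function through its full module path.
pub fn via_full_path(input: &str) -> usize {
    diff::parser::parse_diff(input)
}
```

Both paths resolve to the same item, so call sites may import from either `crate::diff` or `crate::diff::parser`.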

5. Update src/lib.rs

pub mod diff;

6. Update imports in affected files

In src/hook.rs:

// Add at top
use crate::diff::traits::{FilePath, Utf8String, DiffDeltaPath};

// Remove old trait definitions (lines 88-152)

In src/multi_step_integration.rs:

// Add at top
use crate::diff::parser::parse_diff;

// Remove parse_diff function and helpers (lines 186-402)
// Keep the call sites

7. Remove moved code from original files

Delete the code that was moved to avoid duplication.

Verification Criteria

Pass:

  • src/diff/ module exists with mod.rs, parser.rs, traits.rs
  • All diff parsing logic moved from multi_step_integration.rs
  • All utility traits moved from hook.rs
  • No duplicate code remains in original files
  • All imports updated correctly
  • cargo build succeeds
  • cargo test passes all tests
  • All existing diff parsing tests still pass
  • cargo clippy shows no warnings
  • hook.rs reduced in size (about 660 lines once lines 88-152 are removed)

Test manually

# Run the diff parsing tests specifically (after the move they are unit tests in src/diff/parser.rs)
cargo test --lib diff::parser

# Test with actual git repo
cd test-repo
echo "test" > test.txt
git add test.txt
# Trigger hook to test parsing
git commit --no-edit

Estimated Time

4-5 hours

Dependencies

Labels

  • refactor
  • module-structure
  • diff-processing
