Add commit description parsing to git backends #46

Merged
tbrittain merged 1 commit into main from claude/index-commit-descriptions-qVnj2
Mar 5, 2026

@tbrittain
Owner

Summary

This PR adds support for parsing and storing commit message descriptions (body text) in addition to subject lines across both git backends (go-git and native git). The description is now extracted, stored in the database, and available for analysis.

Key Changes

  • Git model: Extended Commit struct with a Description field to store the commit message body (text after the first blank line)
  • go-git backend: Updated to split commit messages on the first double newline, extracting subject and description separately
  • Native git backend: Modified git log format to include the body (%b) and added an end marker (GITANALYTICS_ENDMETA) for reliable parsing of multi-line descriptions
  • Database schema: Added description column to the commits table with a migration in Init() to support existing databases
  • Database insertion: Updated the insert statement to include the description field
  • Tests: Added TestGoGitDescription and TestNativeDescription, which share a helper function (initTestRepoWithDesc) that creates test repositories containing both subject-only and subject+description commits
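
The go-git split listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `splitMessage` is a hypothetical name, and trimming the subject as well as the description is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMessage (hypothetical helper) separates a raw commit message into
// subject and description at the first blank line ("\n\n").
// If there is no blank line, the description is empty.
func splitMessage(msg string) (subject, description string) {
	subject, description, _ = strings.Cut(msg, "\n\n")
	// Trimming the subject too is an assumption; the PR only states
	// that descriptions are trimmed.
	return strings.TrimSpace(subject), strings.TrimSpace(description)
}

func main() {
	subject, desc := splitMessage("Add parser\n\nFirst paragraph.\n\nSecond paragraph.\n")
	fmt.Printf("subject: %q\ndescription: %q\n", subject, desc)
}
```

Because only the first `\n\n` is used as the separator, later blank lines stay inside the description, so multi-paragraph bodies are kept intact.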

Implementation Details

  • The description is extracted by splitting on the first occurrence of \n\n (blank line separator)
  • Descriptions are trimmed of leading/trailing whitespace
  • The native backend uses GITANALYTICS_ENDMETA as a sentinel to reliably detect the end of the description body in the git log output
  • Database migration gracefully handles existing databases by ignoring the error when the column already exists
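
A rough sketch of the "ignore the error when the column already exists" migration is shown below. Everything here is illustrative: `migrateDescription`, `execer`, and `fakeDB` are hypothetical names, the column definition is assumed to mirror the DDL quoted in the commit message, and matching on the text "duplicate column name" assumes SQLite-style error wording. The PR may simply discard the error instead.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
	"strings"
)

// execer is the subset of *sql.DB this sketch needs; *sql.DB satisfies it.
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// Assumed statement: the column definition mirrors the table DDL.
const addDescriptionColumn = `ALTER TABLE commits ADD COLUMN description TEXT NOT NULL DEFAULT ''`

// migrateDescription (hypothetical) adds the column and treats
// "column already exists" as success, so it is safe to run on every Init().
func migrateDescription(db execer) error {
	_, err := db.Exec(addDescriptionColumn)
	if err != nil && strings.Contains(err.Error(), "duplicate column name") {
		return nil // already migrated; assumes SQLite-style wording
	}
	return err
}

// fakeDB stands in for a real database so the sketch runs without a driver.
type fakeDB struct{ err error }

func (f fakeDB) Exec(query string, args ...any) (sql.Result, error) {
	return nil, f.err
}

var (
	errDuplicate    = errors.New("duplicate column name: description")
	errMissingTable = errors.New("no such table: commits")
)

func main() {
	fmt.Println("fresh database:  ", migrateDescription(fakeDB{}))
	fmt.Println("already migrated:", migrateDescription(fakeDB{err: errDuplicate}))
	fmt.Println("other failure:   ", migrateDescription(fakeDB{err: errMissingTable}))
}
```

Filtering on the specific error keeps unrelated failures visible, at the cost of depending on the driver's error text.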

https://claude.ai/code/session_01HJ7FLDkYneKyCiTtGHoRQ8

Commit messages consist of a subject (first line) and an optional body
(everything after the first blank line). Previously only the subject was
captured. This change stores the body in a dedicated `description` column
so it is available alongside the subject for downstream use.

Changes:
- Add `Description` field to the `Commit` struct
- Native git parser: extend format string with `%b%nGITANALYTICS_ENDMETA`
  and collect body lines until the end marker
- Go-git parser: split `c.Message` on `\n\n` to separate subject from body
- Schema: add `description TEXT NOT NULL DEFAULT ''` to commits table DDL
- Store Init(): run `ALTER TABLE commits ADD COLUMN description ...` to
  migrate existing databases (the error is ignored if the column already
  exists)
- InsertCommits(): include description in the INSERT statement
- Tests: add `initTestRepoWithDesc` helper and description assertion tests
  for both the native and go-git implementations
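
To illustrate how the end marker makes multi-line bodies parseable, here is a minimal sketch of a sentinel-based parser. It is not the PR's parser: `parseLog` and `logEntry` are hypothetical names, and the record layout (hash line, subject line, body lines, marker), i.e. a format like `%H%n%s%n%b%nGITANALYTICS_ENDMETA`, is assumed for the example. Only the `%b%nGITANALYTICS_ENDMETA` suffix comes from the change description.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

const endMarker = "GITANALYTICS_ENDMETA"

// logEntry is a hypothetical, reduced stand-in for the Commit struct.
type logEntry struct {
	Hash        string
	Subject     string
	Description string
}

// parseLog reads records laid out as: hash line, subject line,
// zero or more body lines, then the end marker on its own line.
func parseLog(out string) []logEntry {
	var entries []logEntry
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		hash := strings.TrimSpace(sc.Text())
		if hash == "" {
			continue // skip blank separator lines between records
		}
		e := logEntry{Hash: hash}
		if sc.Scan() {
			e.Subject = sc.Text()
		}
		// Collect body lines until the sentinel closes the record.
		var body []string
		for sc.Scan() {
			if sc.Text() == endMarker {
				break
			}
			body = append(body, sc.Text())
		}
		e.Description = strings.TrimSpace(strings.Join(body, "\n"))
		entries = append(entries, e)
	}
	return entries
}

func main() {
	out := "abc123\nAdd parser\nFirst line.\n\nSecond paragraph.\n\n" + endMarker + "\n" +
		"def456\nSubject only\n\n" + endMarker + "\n"
	for _, e := range parseLog(out) {
		fmt.Printf("%s %q %q\n", e.Hash, e.Subject, e.Description)
	}
}
```

Without a sentinel, a blank line inside a body would be indistinguishable from a record boundary. The marker removes that ambiguity as long as no commit body contains the marker text on a line of its own.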

https://claude.ai/code/session_01HJ7FLDkYneKyCiTtGHoRQ8
@tbrittain tbrittain marked this pull request as ready for review March 5, 2026 04:35
@tbrittain tbrittain merged commit da433a0 into main Mar 5, 2026
2 checks passed
@tbrittain tbrittain deleted the claude/index-commit-descriptions-qVnj2 branch March 5, 2026 04:44