[Feature request] Add character and byte metrics to git-ai stats --json #903

@beettlle

Problem

git-ai stats --json today reports volume primarily as line counts. Lines are easy to interpret but coarse for reporting and analytics: they don’t reflect how much text was added in each bucket (human / mixed / AI-accepted, diff totals, etc.) in a way that’s comparable across formatting styles and languages.
Downstream consumers (dashboards, internal CI metrics) would benefit from optional, additive character and byte aggregates that use the same attribution and diff semantics as the existing line fields. That way they would not need to re-implement intersection logic, merge behavior, range aggregation, or --ignore rules outside git-ai.

Proposed solution (high level)

Extend git-ai stats --json (single commit and ranges) with derived totals:

  • Unicode scalar counts (*_chars) and UTF-8 byte lengths (*_bytes) for the same logical buckets as today’s line stats, where applicable.
  • Computed from committed blob text and the same per-file line sets already used for line-level stats.
  • Additive JSON only (e.g. serde(default)), so older clients and older git-ai binaries keep working.
  • No change to Git Notes attestation format for v1 (no authorship/4.x requirement); whole-line semantics only (no intra-line split in storage).
  • TTY output could remain line-based for v1; JSON would carry the new fields (exact UX is up to maintainers).
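
To make the derivation concrete, here is a minimal sketch of how the totals could be computed from blob text plus a per-file set of attributed line numbers. Everything named here (TextVolume, measure_lines, the 1-based line numbering) is hypothetical and not taken from the git-ai codebase. The sketch also assumes line terminators are excluded from both totals, which is exactly the kind of newline rule the PRD would need to pin down.

```rust
use std::collections::BTreeSet;

/// Hypothetical totals for one bucket (e.g. AI-accepted lines in one file).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct TextVolume {
    pub lines: u32,
    pub chars: u64, // Unicode scalar values
    pub bytes: u64, // UTF-8 encoded length
}

/// Sums characters and bytes over the selected 1-based line numbers.
/// ASSUMPTION: "\n" and "\r\n" terminators are not counted; `str::lines`
/// strips them before each line is measured.
pub fn measure_lines(blob_text: &str, selected: &BTreeSet<u32>) -> TextVolume {
    let mut total = TextVolume::default();
    for (index, line) in blob_text.lines().enumerate() {
        let line_no = index as u32 + 1;
        if selected.contains(&line_no) {
            total.lines += 1;
            // `chars()` iterates Unicode scalar values, not grapheme clusters.
            total.chars += line.chars().count() as u64;
            // `len()` on a `&str` is its UTF-8 byte length.
            total.bytes += line.len() as u64;
        }
    }
    total
}
```

Because whole lines are the unit, the same line sets that already drive the line counts could be passed in unchanged; no intra-line attribution is needed.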

Questions for maintainers

  1. Is this in scope for the OSS CLI, and does the approach in the PRD (field matrix, newline/binary/UTF-8 rules, merge/range parity) match how you want the product to evolve?
  2. Default max blob size (or config) for derivation: prefer a cap, or document full-blob reads / OOM risk first?
  3. Should usegitai.com / workflow template updates ship with the feature or after?
  4. Any objection to u64 for new *_chars / *_bytes while line fields stay u32?
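
As an illustration of question 4, the sketch below sums hypothetical per-commit totals over a range. The struct, its field names, and the idea that a range total is a plain sum of per-commit totals are all assumptions made for the example, not a description of how git-ai aggregates ranges. It only shows why byte totals are more likely than line counts to exceed what a u32 can hold (about 4.29 billion).

```rust
/// Hypothetical per-commit totals; field names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CommitVolume {
    pub ai_lines: u32, // line fields stay u32, as today
    pub ai_bytes: u64, // proposed wider type for the new fields
}

/// Sums totals across a range of commits.
/// Returns `None` if either field would overflow its type.
pub fn sum_range(commits: &[CommitVolume]) -> Option<CommitVolume> {
    let mut total = CommitVolume { ai_lines: 0, ai_bytes: 0 };
    for c in commits {
        total.ai_lines = total.ai_lines.checked_add(c.ai_lines)?;
        total.ai_bytes = total.ai_bytes.checked_add(c.ai_bytes)?;
    }
    Some(total)
}
```

Two commits of 3,000,000,000 bytes each already sum to a value that does not fit in a u32, while the u64 field holds it comfortably.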

Contribution

If the maintainers approve the direction, I’m happy to help implement it (or collaborate with whoever picks it up). I opened this as a feature request first so scope and API shape can be agreed before code lands.
