Context
GitService.parseDiff(_:), the log parser, and the blame / status parsers all use String.components(separatedBy:), which copies every line into a new String. That cost grows linearly with history size (large git log runs) and diff size (large changesets), and shows up in Instruments as many small heap allocations.
Suggested by feedback on r/swift.
Proposal
Two-step refactor:
- Switch to Substring slices by replacing components(separatedBy:) with split(separator:). A Substring shares storage with the original String, so the line contents are not copied. Pass omittingEmptySubsequences: false, because split drops empty lines by default while components(separatedBy:) keeps them. Most of the parsers only need slices for matching prefixes (@@, +, -, ...).
- Use Scanner (or a hand-rolled state machine over UnicodeScalarView) for the parsers that walk character by character, particularly the hunk-header parser (@@ -oldStart,oldCount +newStart,newCount @@) and the blame porcelain header. Scanner fits well here because the format is line-anchored and structured.
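A minimal sketch of what the two steps could look like. All names here (HunkHeader, splitLines, parseRange, parseHunkHeader) are hypothetical and not taken from GitService. For the hunk header it uses plain Substring splitting, as one hand-rolled option, rather than Scanner or a scalar-level state machine. It also assumes the unified-diff convention that an omitted count means 1.

```swift
// Hypothetical types and names; GitService's real model may differ.
struct HunkHeader: Equatable {
    var oldStart: Int
    var oldCount: Int
    var newStart: Int
    var newCount: Int
}

/// Step 1: split into Substring slices that share storage with `text`.
/// `omittingEmptySubsequences: false` keeps empty lines, which
/// components(separatedBy:) also keeps.
func splitLines(_ text: String) -> [Substring] {
    text.split(separator: "\n", omittingEmptySubsequences: false)
}

/// Parses "start,count" or just "start" (count is then assumed to be 1).
func parseRange(_ field: Substring) -> (start: Int, count: Int)? {
    let parts = field.split(separator: ",", maxSplits: 1,
                            omittingEmptySubsequences: false)
    guard let first = parts.first, let start = Int(first) else { return nil }
    guard parts.count == 2 else { return (start, 1) }
    guard let count = Int(parts[1]) else { return nil }
    return (start, count)
}

/// Step 2: parse "@@ -oldStart,oldCount +newStart,newCount @@ optional context"
/// without converting the line to a String.
func parseHunkHeader(_ line: Substring) -> HunkHeader? {
    guard line.hasPrefix("@@ ") else { return nil }
    // At most 4 fields: "@@", "-a,b", "+c,d", "@@ trailing context"
    let fields = line.split(separator: " ", maxSplits: 3)
    guard fields.count == 4,
          fields[1].hasPrefix("-"),
          fields[2].hasPrefix("+"),
          fields[3].hasPrefix("@@"),
          let old = parseRange(fields[1].dropFirst()),
          let new = parseRange(fields[2].dropFirst())
    else { return nil }
    return HunkHeader(oldStart: old.start, oldCount: old.count,
                      newStart: new.start, newCount: new.count)
}

// Usage: only header lines are parsed further; other lines stay as slices.
let sampleDiff = "@@ -1,4 +1,6 @@ func foo()\n+added\n-removed\n\n context"
let headers = splitLines(sampleDiff).compactMap(parseHunkHeader)
print(headers.count)
```

The slices stay valid only as long as the original String is alive, so any value that outlives the parse (a file path, a commit hash) would still need an explicit String(...) conversion at the point where it is stored.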
Files in scope
Services/GitService.swift — parseDiff, parseBlame, log, status, branches, parseBranchFromRefs
Services/ConflictParser.swift — currently uses components(separatedBy:) over the whole file content
Acceptance criteria
- No behavioural change in the parsed output (existing call sites keep working)
- Measurable reduction in allocation count when parsing a git log of 200+ commits or a diff over a few hundred KB (Instruments → Allocations template)
- Build still passes, SwiftLint still clean
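
The first criterion could be checked with a small equivalence test along these lines. This is a sketch only: the sample inputs are made up for illustration, and a real test would feed captured git output through the old and new parsers and compare the parsed models.

```swift
import Foundation

// Made-up inputs covering the edge cases where the two APIs could diverge.
let samples = ["", "one", "a\nb", "a\n\nb", "trailing\n", "\nleading"]

/// Old behaviour: every line copied into its own String.
func oldLines(_ text: String) -> [String] {
    text.components(separatedBy: "\n")
}

/// New behaviour: Substring slices, converted to String only for comparison.
func newLines(_ text: String) -> [String] {
    text.split(separator: "\n", omittingEmptySubsequences: false)
        .map { String($0) }
}

for sample in samples {
    assert(oldLines(sample) == newLines(sample), "mismatch for \(sample.debugDescription)")
}

// With the default arguments, split drops the empty line in the middle.
let withDefaults = "a\n\nb".split(separator: "\n")
print(withDefaults.count)
```

CRLF input is worth a separate test case, since Swift treats "\r\n" as a single Character and split(separator: "\n") does not break on it.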