Strengthen fuzzy_match edge-case contracts in stringutil tests#35077
Conversation
Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
|
🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅ |
|
🧪 Test Quality Sentinel completed test quality analysis. |
|
✅ Design Decision Gate 🏗️ completed the design decision gate check. No ADR enforcement needed: PR #35077 does not have the 'implementation' label and has only 30 new lines of code in business logic directories (threshold: 100). |
|
✅ PR Code Quality Reviewer completed the code quality review. |
There was a problem hiding this comment.
Pull request overview
This PR strengthens the contract coverage for pkg/stringutil fuzzy-matching behavior by adding targeted, table-driven tests for edge cases and ordering rules, plus a clarifying UTF-8/byte semantics case for LevenshteinDistance.
Changes:
- Adds explicit
FindClosestMatchescases fornilcandidates,maxResults == 0, and excluding candidates beyond the distance cutoff (> 3). - Adds a deterministic ordering test to lock in alphabetical tie-breaking when edit distances are equal.
- Adds a multibyte UTF-8 example to document that
LevenshteinDistancecurrently operates on bytes (not runes).
Show a summary per file
| File | Description |
|---|---|
| pkg/stringutil/fuzzy_match_test.go | Adds table-driven edge-case, ordering, and UTF-8/byte-semantics test coverage for fuzzy matching and Levenshtein distance. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 1/1 changed files
- Comments generated: 0
There was a problem hiding this comment.
All five new test cases are mathematically correct against the implementation. Clean, focused test coverage.
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
proxy.golang.org
To allow these domains, add them to the
network.allowedlist in your workflow frontmatter:
network:
allowed:
- defaults
- "proxy.golang.org"See Network Configuration for more information.
🔎 Code quality review by PR Code Quality Reviewer · sonnet46 683.5K
🧪 Test Quality Sentinel Report✅ Test Quality Score: 100/100 — Excellent
📊 Metrics & Test Classification (5 tests analyzed)
Test Classification Details
Language SupportTests analyzed:
Verdict
📖 Understanding Test ClassificationsDesign Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.
|
There was a problem hiding this comment.
Skills-Based Review 🧠
Applied /tdd — approving with minor suggestions. All 30 additions are test-only, all distances are correct, and the new cases meaningfully extend the contract surface.
📋 Key Themes & Highlights
Positive Highlights
- ✅
maxResults: 0andnilcandidates harden guard-rail contracts that weren't previously explicit - ✅ Alphabetical tie-breaking case locks in deterministic ordering that was only implicitly guaranteed
- ✅ UTF-8 byte-semantics comment is accurate and the inline documentation approach is exactly right for a non-obvious implementation detail
- ✅ Test names are descriptive and read well as specifications
Minor Suggestions (non-blocking)
nilvs empty redundancy (line 91) — both cases hit the samerange nilcode path; either remove or add a comment explaining the distinction- Distance cutoff comment (line 106) — an inline comment showing
"abcdefg"distance = 4 would make the threshold self-evident without requiring the reader to trace into the function - End-to-end UTF-8 test (line 141) — the byte-semantics test covers
LevenshteinDistancein isolation; a companion case inFindClosestMatcheswould document the mixed rune/byte pipeline at the public API level
🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · sonnet46 1.4M
| }, | ||
| { | ||
| name: "nil candidates returns nil", | ||
| target: "copilot", |
There was a problem hiding this comment.
[/tdd] This nil candidates case duplicates the contract already proven by "empty candidates returns nothing" — both paths hit the same range no-op.
💡 Suggestion
Either remove this case, or add a clarifying comment that distinguishes why nil is worth a separate assertion (e.g., to guard against panics from a nil check that doesn't exist today but could be added):
// nil is distinct from []string{} in Go; both must be safe to pass.
{
name: "nil candidates returns nil (not a panic)",
...
},Without that rationale the reader may wonder why both exist.
| { | ||
| name: "distance four candidate excluded", | ||
| target: "abc", | ||
| candidates: []string{"abx", "abcdefg"}, |
There was a problem hiding this comment.
[/tdd] The test correctly excludes "abcdefg" but doesn't make the distance self-evident to a reader. Adding an inline comment documenting the exact distance makes this a stronger living specification.
💡 Suggestion
{
name: "distance four candidate excluded",
target: "abc",
// "abcdefg" has distance 4 (> maxDistance of 3), so only "abx" (distance 1) is returned.
candidates: []string{"abx", "abcdefg"},
maxResults: 3,
want: []string{"abx"},
},This is especially useful because the maxDistance = 3 constant is const-local to the function body — a reader can't easily cross-reference the cutoff value.
| @@ -112,6 +140,8 @@ func TestLevenshteinDistance(t *testing.T) { | |||
| {name: "push vs pus", a: "push", b: "pus", want: 1}, | |||
| {name: "contents vs scope typo", a: "contents", b: "cntents", want: 1}, | |||
There was a problem hiding this comment.
[/tdd] Great documentation of the byte-vs-rune semantic. One follow-up worth considering: FindClosestMatches applies strings.ToLower (rune-aware) before calling LevenshteinDistance (byte-based), creating a mixed-semantics pipeline for non-ASCII input.
💡 Suggested additional test for FindClosestMatches
A companion test in TestFindClosestMatches would make this gap explicit:
{
// strings.ToLower is rune-aware; LevenshteinDistance is byte-based.
// A multibyte candidate may appear closer or farther than expected.
name: "multibyte candidate utf8 byte distance",
target: "e",
candidates: []string{"é"}, // byte distance 2 — exceeds cutoff, excluded
maxResults: 3,
want: nil,
},This is not a bug introduced by this PR — it's pre-existing — but the byte-level test only covers LevenshteinDistance in isolation. A FindClosestMatches counterpart makes the end-to-end contract explicit.
|
|
pkg/stringutil/fuzzy_match_test.gocovered core behavior but left key contract edges implicit. This update adds focused table-driven cases to lock in exclusion thresholds, nil/zero-result behavior, and deterministic ordering.FindClosestMatches: edge-case contract coverage
<= 3).nilcandidates case to ensure return isnil(no panic/regression surface).maxResults: 0case to ensure no suggestions are returned.FindClosestMatches: deterministic ordering contract
LevenshteinDistance: byte-level behavior documentation
"é"vs"e") with an inline comment to document current byte-based distance semantics.{ name: "alphabetical tie breaking for equal distances", target: "zzzz", candidates: []string{"zzzb", "zzza"}, maxResults: 2, want: []string{"zzza", "zzzb"}, }