Refactor sanitizer APIs to separate artifact identifiers from code identifiers by Copilot · Pull Request #27584 · github/gh-aw

Copilot · 2026-04-21T13:46:04Z

This issue identified ambiguous sanitizer naming across pkg/workflow and pkg/stringutil, where similarly named functions serve different domains (hyphenated artifact/user-agent IDs vs underscore-based programming identifiers). This change clarifies those boundaries with explicit API names and docs.

Workflow sanitizer naming made domain-explicit
- Renamed workflow.SanitizeIdentifier to workflow.SanitizeArtifactIdentifier.
- Updated workflow-side call sites and tests to use the new name.
- Clarified package-level and function-level docs to describe artifact/user-agent semantics.
Stringutil identifier sanitizer promoted and documented
- Exported stringutil.sanitizeIdentifierName as stringutil.SanitizeIdentifierName.
- Documented intended usage for programming-language identifiers and contrasted with workflow artifact sanitization.
- Kept existing behavior intact for SanitizeParameterName and SanitizePythonVariableName by routing them through the exported helper.
Focused test updates
- Renamed workflow sanitizer test to match the new API.
- Added dedicated tests for stringutil.SanitizeIdentifierName to lock in underscore behavior, numeric-prefix handling, and optional extra allowed characters.

// workflow domain (hyphenated artifact/user-agent ID)
userAgent := workflow.SanitizeArtifactIdentifier("Weekly v2.0")
// "weekly-v2-0"

// code identifier domain (underscore-based variable/param ID)
param := stringutil.SanitizeIdentifierName("my-param", nil)
// "my_param"

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/b76bdebf-72b0-48e3-8f17-92d521a5e130 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot

Pull request overview

Refactors sanitizer APIs to disambiguate workflow artifact/user-agent identifiers (hyphenated) from programming-language identifiers (underscore-based), with updated call sites, docs, and tests.

Changes:

Renamed workflow.SanitizeIdentifier to workflow.SanitizeArtifactIdentifier and updated workflow renderer + tests.
Exported stringutil.SanitizeIdentifierName (formerly private) and routed existing helpers through it.
Added unit tests and README documentation for the newly exported identifier sanitizer.

Show a summary per file

File	Description
pkg/workflow/strings.go	Renames and documents the workflow-domain sanitizer as `SanitizeArtifactIdentifier`.
pkg/workflow/mcp_renderer_github.go	Updates user-agent fallback to use `SanitizeArtifactIdentifier`.
pkg/workflow/codex_engine_test.go	Renames the workflow sanitizer test and updates assertions to the new API.
pkg/stringutil/sanitize.go	Exports `SanitizeIdentifierName` and reroutes existing sanitizers through it.
pkg/stringutil/sanitize_test.go	Adds dedicated tests for `SanitizeIdentifierName`.
pkg/stringutil/README.md	Documents `SanitizeIdentifierName` in the package README.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (2)

pkg/workflow/strings.go:419

There are still references to the old name SanitizeIdentifier in repository docs/comments (e.g., pkg/workflow/strings.go:135,196, pkg/workflow/README.md, and the linked scratchpad docs). Since this function is now SanitizeArtifactIdentifier, these references should be updated to avoid confusing API guidance/search results.

This issue also appears on line 448 of the same file.

// SanitizeArtifactIdentifier sanitizes a workflow name to create a safe identifier
// suitable for use as a user agent string or similar context.
//
// This is a SANITIZE function (character validity pattern). Use this when creating
// identifiers that must be purely alphanumeric with hyphens, with no special characters
// preserved. Unlike SanitizeWorkflowName which preserves dots and underscores, this

pkg/workflow/strings.go:456

The debug log messages still say "identifier" even though the exported API is now domain-specific (SanitizeArtifactIdentifier). Consider updating the log messages to mention "artifact identifier" (and using %q for the input) to make debug output unambiguous and resilient to whitespace/newlines in names.

func SanitizeArtifactIdentifier(name string) string {
	stringsLog.Printf("Sanitizing identifier: %s", name)
	result := SanitizeName(name, &SanitizeOptions{
		PreserveSpecialChars: []rune{},
		TrimHyphens:          true,
		DefaultValue:         "github-agentic-workflow",
	})
	if result != name {
		stringsLog.Printf("Sanitized identifier: %s -> %s", name, result)

Files reviewed: 6/6 changed files
Comments generated: 3

github-actions · 2026-04-21T15:06:28Z

🧪 Test Quality Sentinel Report

Test Quality Score: 90/100

✅ Excellent test quality

Metric	Value
New/modified tests analyzed	2
✅ Design tests (behavioral contracts)	2 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	2 (100%)
Duplicate test clusters	0
Test inflation detected	⚠️ Yes (`sanitize_test.go`: 35 added vs `sanitize.go`: 12 added, ratio 2.9:1)
🚨 Coding-guideline violations	0

Test Classification Details

View All Test Classifications

Test	File	Classification	Notes
`TestSanitizeIdentifierName`	`pkg/stringutil/sanitize_test.go`	✅ Design	Table-driven, 3 cases including edge cases (numeric prefix, extra allowed chars)
`TestSanitizeArtifactIdentifier`	`pkg/workflow/codex_engine_test.go`	✅ Design	Renamed from `TestSanitizeIdentifier`; 10 cases including empty input and special-char-only edge cases

Test Detail Notes

TestSanitizeIdentifierName — New test covering the newly extracted SanitizeIdentifierName function. All three table rows exercise a distinct contract: default replacement, numeric-prefix handling (edge case), and opt-in extra-allowed characters. Verifies observable return values, not internals.

TestSanitizeArtifactIdentifier — Pure rename following the API refactor. Existing 10-row table remains intact with solid edge-case coverage: empty string, whitespace-only, special-character-only, and mixed separator inputs. No behavioral regression risk.

Flagged Items — Informational Only

⚠️ Test Inflation: `pkg/stringutil/sanitize_test.go` (informational)

35 lines added vs 12 lines in sanitize.go (ratio ≈ 2.9:1, threshold: 2:1). However, this is a table-driven test covering multiple distinct edge cases, which naturally produces verbose test code. The extra lines represent real behavioral coverage (3 separate contract-relevant scenarios), not padding. No action required.

Language Support

Tests analyzed:

🐹 Go (*_test.go): 2 tests — both unit (//go:build !integration) ✅ build tags present on all files
🟨 JavaScript (*.test.cjs, *.test.js): 0 tests changed

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). Both new/modified tests enforce behavioral contracts with good edge-case coverage.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §24729751175

🧪 Test quality analysis by Test Quality Sentinel · ● 395.8K · ◷

github-actions

✅ Test Quality Sentinel: 90/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). Both changed tests enforce behavioral contracts with solid edge-case coverage. Minor test inflation noted in sanitize_test.go (2.9:1 ratio) but justified by meaningful table-driven edge cases.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

github-actions · 2026-04-21T15:13:41Z

✅ Smoke CI completed successfully!

github-actions · 2026-04-21T15:15:21Z

✅ smoke-ci: safeoutputs CLI comment only run (24730350270)

Generated by Smoke CI for issue #27584 · ◷

Initial plan

eb4867e

Copilot AI assigned Copilot and pelikhan Apr 21, 2026

Copilot started work on behalf of pelikhan April 21, 2026 14:13 View session

Copilot AI linked an issue Apr 21, 2026 that may be closed by this pull request

[refactor] Semantic Function Clustering: Refactoring Opportunities in pkg/ #27551

Closed

7 tasks

refactor: clarify workflow and code identifier sanitizers

a7edd1d

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/b76bdebf-72b0-48e3-8f17-92d521a5e130 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Refactor opportunities identified in pkg/ based on semantic clustering~~ Refactor sanitizer APIs to separate artifact identifiers from code identifiers Apr 21, 2026

Copilot finished work on behalf of pelikhan April 21, 2026 14:27

Copilot AI requested a review from pelikhan April 21, 2026 14:27

pelikhan marked this pull request as ready for review April 21, 2026 15:00

Copilot AI review requested due to automatic review settings April 21, 2026 15:00

Copilot started reviewing on behalf of pelikhan April 21, 2026 15:01 View session

github-actions Bot mentioned this pull request Apr 21, 2026

[aw] No-Op Runs #27396

Closed

Copilot AI reviewed Apr 21, 2026

View reviewed changes

Comment thread pkg/stringutil/sanitize.go Outdated

Comment thread pkg/stringutil/README.md Outdated

Comment thread pkg/workflow/mcp_renderer_github.go Outdated

github-actions Bot approved these changes Apr 21, 2026

View reviewed changes

pelikhan and others added 3 commits April 21, 2026 08:12

Update mcp_renderer_github.go

d70513a

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update README.md

fd02f18

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update sanitize.go

1ee89f4

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

pelikhan merged commit d30542e into main Apr 21, 2026
15 of 17 checks passed

pelikhan deleted the copilot/refactor-semantic-function-clustering-another-one branch April 21, 2026 15:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor sanitizer APIs to separate artifact identifiers from code identifiers#27584

Refactor sanitizer APIs to separate artifact identifiers from code identifiers#27584
pelikhan merged 5 commits intomainfrom
copilot/refactor-semantic-function-clustering-another-one

Copilot AI commented Apr 21, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Apr 21, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

github-actions Bot commented Apr 21, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Copilot AI commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Apr 21, 2026

🧪 Test Quality Sentinel Report

Test Quality Score: 90/100

Test Classification Details

Test Detail Notes

Flagged Items — Informational Only

⚠️ Test Inflation: pkg/stringutil/sanitize_test.go (informational)

Language Support

Verdict

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Apr 21, 2026 •

edited

Loading

⚠️ Test Inflation: `pkg/stringutil/sanitize_test.go` (informational)

github-actions Bot commented Apr 21, 2026 •

edited

Loading